gkcore codebase improvements discussion
This issue discusses the current GNUKhata setup and possible improvements that would make development easier and the codebase more scalable.
Current Scenario
JWT based access management
- A token (gkusertoken) is sent to the user after a successful login. This token contains userid and username.
- A new token (gktoken) is sent to the user when they select an organization from the organization selection screen. This token contains userid and orgcode.
- gktoken is sent in the headers of every GET, POST and PUT request.
- Inside each API function/method, the user ID and organization code are extracted from the token. If the token is absent, or the user ID cannot be extracted, gkstatus 2 is returned as the response.
- No user role based permission management could be found.
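The per-request check described above can be sketched roughly as follows. This is a standard-library-only illustration, not the gkcore implementation. The HS256 signing, the SECRET value, the helper names (encode_token, decode_token, handle_request) and the success value gkstatus 0 are assumptions made for the example. Only the gktoken header, the userid and orgcode claims, and gkstatus 2 come from the description above.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"example-secret"  # hypothetical; gkcore creates its own secret string


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the compact JWT form strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()


def encode_token(claims: dict) -> str:
    """Build a compact HS256 JWT carrying the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url_encode(_sign(signing_input))


def decode_token(token: str) -> dict:
    """Return the claims; raise ValueError if the token is malformed or tampered with."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = _sign(header_b64 + "." + payload_b64)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(_b64url_decode(payload_b64))


def handle_request(headers: dict) -> dict:
    """Mimic the check repeated in each API method."""
    token = headers.get("gktoken")
    if token is None:
        return {"gkstatus": 2}
    try:
        claims = decode_token(token)
        userid, orgcode = claims["userid"], claims["orgcode"]
    except (ValueError, KeyError, TypeError):
        return {"gkstatus": 2}
    # gkstatus 0 for success is an assumption of this sketch.
    return {"gkstatus": 0, "userid": userid, "orgcode": orgcode}
```

Because this block is repeated in every API method today, it is a natural candidate to move into a single shared place.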
Database Connection Management
- The database engine is created in gkcore/models/meta.py and returned from the dbconnect function.
- dbconnect is called in gkcore/__init__.py and its result is assigned to eng (along with creating the secret string, inside a try-except block).
- This engine is managed as an API class attribute. Inside each API method, connections are used in a try-except block that returns the gkstatus for "connection failed" on any exception. Some exceptions are custom handled to return a duplicate entry response.
- We mostly use the Core features of SQLAlchemy; the ORM functionality is not used.
Note: We currently use SQLAlchemy 1.3.20, which is now a legacy version.
Refactoring steps
A branch with the refactored code can be seen here - https://gitlab.com/gnukhata/gkcore/-/tree/backend-refactoring-poc. The project configuration and the Unit of Measurement API are refactored in that PoC branch.
Organize code
- Move database related code to gkcore/models/meta.py - 90ba2df6
- Move routing to a separate file, gkcore/routes.py - 5e47a2b8
There is an instance of circular importing between gkcore/__init__.py and gkcore/utils.py.
Add security policy
- Add a basic security policy that checks for authentication and allows all permissions to the admin user role. The security policy strictly follows the current authentication mechanism - 490cc306
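To make the idea concrete, here is a rough sketch of what such a policy could look like. It is not the code from commit 490cc306. The class name GKSecurityPolicy, the injected decode_token and get_user_role callables, the "admin" role string, and the choice to deny every permission to non-admin users are all assumptions for illustration. The method names (identity, authenticated_userid, permits, remember, forget) are those Pyramid 2.0 expects from a security policy. In a real Pyramid application, permits would normally return pyramid.security.Allowed or Denied objects instead of plain booleans, the policy would be registered with config.set_security_policy(...), and views would opt in through @view_config(permission="...").

```python
from types import SimpleNamespace


class GKSecurityPolicy:
    """Hypothetical policy: authenticate via gktoken, allow everything to admins."""

    def __init__(self, decode_token, get_user_role):
        # decode_token(token) -> claims dict, raising ValueError on a bad token.
        # get_user_role(userid, orgcode) -> role name. Both are injected stand-ins.
        self._decode_token = decode_token
        self._get_user_role = get_user_role

    def identity(self, request):
        token = request.headers.get("gktoken")
        if token is None:
            return None
        try:
            claims = self._decode_token(token)
        except ValueError:
            return None
        if "userid" not in claims or "orgcode" not in claims:
            return None
        return claims

    def authenticated_userid(self, request):
        identity = self.identity(request)
        return None if identity is None else identity["userid"]

    def permits(self, request, context, permission):
        identity = self.identity(request)
        if identity is None:
            return False  # not authenticated
        role = self._get_user_role(identity["userid"], identity["orgcode"])
        return role == "admin"  # assumption: non-admin roles get nothing yet

    def remember(self, request, userid, **kw):
        return []  # tokens are issued by the login views, so no headers here

    def forget(self, request, **kw):
        return []


# Trying the logic with stand-in objects (no Pyramid required).
TOKENS = {
    "admin-token": {"userid": 1, "orgcode": 10},
    "staff-token": {"userid": 2, "orgcode": 10},
}
ROLES = {1: "admin", 2: "operator"}


def fake_decode(token):
    if token not in TOKENS:
        raise ValueError("unknown token")
    return TOKENS[token]


policy = GKSecurityPolicy(fake_decode, lambda userid, orgcode: ROLES[userid])
admin_request = SimpleNamespace(headers={"gktoken": "admin-token"})
staff_request = SimpleNamespace(headers={"gktoken": "staff-token"})
anonymous_request = SimpleNamespace(headers={})
```

Keeping the role lookup behind one permits method is what later allows per-role rules to be added without touching individual API views.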
Refactor Unit of Measurement API
This API was selected because it is small: 245 lines of code, reduced to 165 after refactoring - 29f3b5df
- Apply the newly added security policy to the API by adding a permission string to the @view_config decorator.
- Use a context manager to reliably handle DBAPI connections.
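The context manager approach can be sketched as below. This is an illustration under stated assumptions, not the refactored API itself: an in-memory SQLite engine stands in for the engine returned by dbconnect, and the table definition and function names are hypothetical. engine.begin() is a real SQLAlchemy call that opens a connection, starts a transaction, commits if the block succeeds, rolls back if it raises, and releases the connection in both cases. Which exact form the PoC uses is not shown here.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.exc import IntegrityError

# Assumption: in-memory SQLite replaces the real database for this sketch.
eng = create_engine("sqlite://")

with eng.begin() as conn:
    conn.execute(
        text(
            "CREATE TABLE unitofmeasurement ("
            "uomid INTEGER PRIMARY KEY, unitname TEXT NOT NULL UNIQUE)"
        )
    )


def add_units(names):
    # All inserts share one transaction: either every row is stored or none is.
    with eng.begin() as conn:
        for name in names:
            conn.execute(
                text("INSERT INTO unitofmeasurement (unitname) VALUES (:unitname)"),
                {"unitname": name},
            )


def list_units():
    # The connection is released when the block exits, even on error.
    with eng.connect() as conn:
        result = conn.execute(
            text("SELECT unitname FROM unitofmeasurement ORDER BY unitname")
        )
        return [row[0] for row in result]


add_units(["kg", "litre"])
try:
    add_units(["metre", "kg"])  # "kg" violates the UNIQUE constraint
except IntegrityError:
    pass  # the whole batch, including "metre", was rolled back
```

No try-except is needed inside add_units: the exception propagates to the caller (or to a central exception view), while the context manager takes care of rollback and cleanup.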
Consequences
- Since we are not suppressing exceptions, the server will raise 5xx errors. The same applies to permission management: failures will be reported through HTTP response status codes instead of always sending status 200 with a gkstatus value. The front end will have to respect HTTP status codes.
- userAuthCheck and authCheck will be deprecated. Even though the current PoC supports both the old and the new authentication handling, it is suggested to eventually deprecate the old one and also add proper support for user role based permission management.
Benefits
- API level user role based permission management can be implemented.
- API codebase complexity can be reduced, along with the number of lines of code.
- APIs will be debuggable, since exceptions are not suppressed.
- With proper rollback support and isolation, the codebase can be refactored to be ACID compliant, which will make the stored data more reliable.
Migration planning
- The current refactored project configuration can support both old and new API implementations simultaneously.
- We could test the new API structure in a couple of frequently used APIs.
- If the API tests do not fail, we can migrate the APIs one by one to the new structure over time.
Extra
Overriding exception handling
The view_config decorator can be used to handle exceptions too, so we can manage each exception individually if we want. The code below sends response status 500, along with the exception error text, if any API raises an exception. At the same time, it prints the traceback on the console.
Even if we decide to keep sending status 200 with a gkstatus value, this can be used to send custom responses on exceptions.
diff --git a/gkcore/views/__init__.py b/gkcore/views/__init__.py
index e69de29b..c8159909 100644
--- a/gkcore/views/__init__.py
+++ b/gkcore/views/__init__.py
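@@ -0,0 +1,21 @@
+import sys
+import traceback
+
+from pyramid.view import view_config
+
+# Assumption: DEBUG is not defined in the original snippet; here it is a
+# module-level flag that would normally come from the project settings.
+DEBUG = False
+
+
+@view_config(context=Exception)
+def exception_view(error, request):
+    # Print the traceback of the exception being handled to the console.
+    traceback.print_exc(file=sys.stdout)
+    request.response.status = 500
+    if DEBUG:
+        # Debug mode: show the error message in the response.
+        request.response.json_body = {"error": f"{error}"}
+    else:
+        request.response.json_body = {"error": "Oh no! Something terrible happened."}
+    return request.response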
Query output serializer
Currently we are manually assigning key-value pairs to build the response JSON. This is because a SQLAlchemy result is not always serializable. Multiple strategies are listed here - https://stackoverflow.com/questions/5022066/how-to-serialize-sqlalchemy-result-to-json.
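One of the simpler strategies is a shared JSON "default" hook, sketched below. The function names and the sample column names are hypothetical, and rendering Decimal values as strings is a design choice of this sketch (it avoids float rounding for monetary values), not something the project has decided. The rows are assumed to be mapping-like: with SQLAlchemy 1.3 a result row can be passed to dict() directly, while on 1.4 and later row._mapping would be passed instead.

```python
import datetime
import decimal
import json


def _json_default(value):
    """Convert the column types that json.dumps cannot handle on its own."""
    if isinstance(value, decimal.Decimal):
        return str(value)  # keep exact digits instead of converting to float
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def rows_to_json(rows):
    """Serialize an iterable of mapping-like rows to a JSON array."""
    return json.dumps([dict(row) for row in rows], default=_json_default)


# Plain dicts stand in for database rows here.
sample_rows = [
    {
        "uomid": 1,
        "unitname": "kg",
        "conversionrate": decimal.Decimal("1.50"),
        "createdon": datetime.date(2024, 1, 31),
    }
]
sample_json = rows_to_json(sample_rows)
```

A single helper like this would replace the per-view key-by-key assignment, at the cost of exposing every selected column unless the query or a schema narrows it.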
Using SQLAlchemy ORM
Currently we are using the Core features only (https://docs.sqlalchemy.org/en/20/intro.html). The benefits of using the SQLAlchemy ORM need to be evaluated.
Migrations Management
Currently we are adding migrations to the db_migrate.py file using raw SQL. This is not scalable and will make managing database changes difficult. A dedicated migration management tool will help us split up the current single migration file and version the pieces. The main advantage will be the ability to move up and down through these versions.
Alembic is a great migrations manager that is commonly used along with SQLAlchemy - https://alembic.sqlalchemy.org/en/latest/.
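To illustrate what "moving up and down through versions" means, here is a toy sketch using only the standard library. It is not Alembic's API and not a proposal to hand-roll a migration tool: the MIGRATIONS list, the function names, the table and index names, and the use of SQLite's user_version pragma to store the schema version are all assumptions for the example. Alembic provides the same idea through separate revision files, each with an upgrade and a downgrade function.

```python
import sqlite3

# Each entry: (version, upgrade SQL, downgrade SQL). Names are hypothetical.
MIGRATIONS = [
    (
        1,
        "CREATE TABLE unitofmeasurement (uomid INTEGER PRIMARY KEY, unitname TEXT)",
        "DROP TABLE unitofmeasurement",
    ),
    (
        2,
        "CREATE INDEX idx_unitname ON unitofmeasurement (unitname)",
        "DROP INDEX idx_unitname",
    ),
]


def current_version(conn):
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target):
    """Step the schema up or down, one version at a time, until it reaches target."""
    if not 0 <= target <= len(MIGRATIONS):
        raise ValueError(f"unknown schema version {target}")
    version = current_version(conn)
    while version != target:
        if version < target:
            _, upgrade_sql, _ = MIGRATIONS[version]  # version -> version + 1
            conn.execute(upgrade_sql)
            version += 1
        else:
            _, _, downgrade_sql = MIGRATIONS[version - 1]  # version -> version - 1
            conn.execute(downgrade_sql)
            version -= 1
        # PRAGMA values cannot be bound as parameters; version is always an int.
        conn.execute(f"PRAGMA user_version = {int(version)}")
    return version


def schema_objects(conn):
    """List the tables and indexes that currently exist."""
    return sorted(row[0] for row in conn.execute("SELECT name FROM sqlite_master"))
```

For example, calling migrate(conn, 2) on a fresh database applies both steps, and migrate(conn, 1) afterwards removes only the index again.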
Back-end validation
Currently we do not have a project-wide back-end validation mechanism. Such a mechanism would help reduce data errors and improve security. Suggestions are:
- Colander with ColanderAlchemy - Crafted for use in Pyramid with SQLAlchemy, with auto schema generation capabilities. (https://docs.pylonsproject.org/projects/colander/en/latest/, https://colanderalchemy.readthedocs.io/)
- Pydantic - One of the most used Python validation libraries (https://docs.pydantic.dev/latest/)
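As a rough illustration of the Pydantic option, the sketch below validates an incoming payload before it reaches the database. The model name, the field names, their limits and the validate_payload helper are hypothetical and not taken from the gkcore schema. Only BaseModel, Field and ValidationError are Pydantic's own names, and the example sticks to features available in both Pydantic 1.x and 2.x.

```python
from pydantic import BaseModel, Field, ValidationError


class UnitOfMeasurementIn(BaseModel):
    """Hypothetical request body for creating a unit of measurement."""

    unitname: str = Field(min_length=1, max_length=50)
    conversionrate: float = Field(default=1.0, gt=0)


def validate_payload(data):
    """Return (model, []) on success or (None, [invalid field names]) on failure."""
    try:
        unit = UnitOfMeasurementIn(**data)
    except ValidationError as exc:
        fields = sorted(".".join(str(part) for part in err["loc"]) for err in exc.errors())
        return None, fields
    return unit, []
```

A view could call validate_payload(request.json_body) and respond with HTTP 400 listing the invalid fields when the second element is not empty.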
References
- Pyramid security documentation: https://docs.pylonsproject.org/projects/pyramid/en/latest/narr/security.html
- SQLAlchemy Core Connections documentation: https://docs.sqlalchemy.org/en/13/core/connections.html
- SQLAlchemy Session API documentation: https://docs.sqlalchemy.org/en/20/orm/session_api.html
- SQLAlchemy ORM Session creation FAQ: https://docs.sqlalchemy.org/en/13/orm/session_basics.html#session-faq-whentocreate
- Pyramid Configurator API documentation: https://docs.pylonsproject.org/projects/pyramid/en/2.0-branch/api/config.html
- View Config documentation: https://docs.pylonsproject.org/projects/pyramid/en/2.0-branch/narr/viewconfig.html