We should implement a handler that can hijack a JSON-encoded body, stripping base64-encoded files and replacing them with the usual metadata that Workhorse generates on upload acceleration.
@nolith I also think this makes sense. I wonder how we would detect files though? E.g. which JSON field in an NPM JSON body gets treated as an "upload". Is there a way for us to know, or do we have to register each "JSON-hidden upload" route in workhorse along with the upload fields?
@sabrams @10io @hortiz5 @michelletorres I was looking to prioritize some performance enhancements and remember discussing moving npm to workhorse. I see we have this issue and #13078 to do so. What is the expected impact of this change? Can we do this work or do we need help from other teams?
Are there other performance or reliability issues that you think are higher priority than this?
Read the uploaded file from the JSON-encoded body instead of the whole body or a multipart form.
The rest of the upload processing (in particular, the :ping_pong:s) stays the same.
The uploaded file should be in uploaded file params (the ones that are JWT signed) as usual.
Consider using the same approach as for multipart uploads, as we could have multiple files in a single JSON structure.
Caution must be taken here not to load the whole body into memory in order to JSON-decode it.
Ideally, this change should be done in a generic way so that we can programmatically specify where the uploaded file is in the JSON structure (perhaps using a JSON path in dot notation).
This is to support any upload that is JSON-encoded.
Rails changes (weight 2)
Implement the /authorize endpoint
Update the upload endpoint to actually read the file passed by Workhorse (and processed by the Rails middleware).
(2.) is the usual code area of the package team. I don't see any major obstacle here.
(1.) is limited in scope (it's just a matter of reading the uploaded file in a different way than we currently can), but we have less visibility here as it's not our usual code area, although we have already worked on package uploads. This change has more grey areas for us (package team). That's why I would put a weight of 3.
Having said that, this provides a good opportunity for the Package team to extend its knowledge of Workhorse in general, as it is a piece of code we need to look at (and update) from time to time. We might need to lay out a solution for (2.) and ping a Workhorse maintainer to acknowledge the overall direction.
Thank you @10io, that is really helpful. Looking at the plan for milestone 14.4, it's looking tight to take on this project. But maybe we can take one step forward to create a layout for (2.) and ping a Workhorse maintainer? Then we can schedule the implementation issues in a later milestone.