[#87][#44][#139] Chunked encoding & misc

Sam Habiel requested to merge shabiel/YDB-Web-Server:mws87-chuncked into master

Chunked Encoding

Summary

This commit implements both sending and receiving of chunked-encoded data per RFC 9112 §7.1: https://datatracker.ietf.org/doc/html/rfc9112#section-7.1.

Receiving is done by implementing the hitherto-unimplemented rdchnks^%ydbwebreq. This required some refactoring of other parts of the routine. Tested by tRecChuncked^%ydbwebtest, which calls /test/postchunked (chunkedpost^%ydbwebapi).

%ydbweburl now supports a 4th data item. This can be anything, but currently we only support chunkCallback=, which lets the POST URL specify a callback routine that is invoked as each chunk is received. The URL test/postchunkedinc sends each chunk through chunkedpostincread^%ydbwebapi, which puts it in a variable chunkedread; final processing is done in chunkedpostinc^%ydbwebapi, which takes chunkedread and puts it in httprsp. Tested by tRecChunckedInc^%ydbwebtest.

Sending is done by new code in %ydbwebrsp. The code that sends data was refactored into rsptype1 and rsptype2; as part of that refactoring, gzip was also refactored. Tested by tSendChunked, which calls /test/getchunked (chunkedget^%ydbwebapi).

Gzipping of chunked portions is currently not supported.

Auto-encoding of JSON in chunked portions is not done.
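For reference, this is the wire format defined by RFC 9112 §7.1 that the new code produces and consumes. The following Python sketch is illustrative only (the function name is hypothetical; the server's actual implementation is in M): each chunk is framed as its length in hex, CRLF, the payload, CRLF, and the body ends with a zero-length chunk.

```python
# Illustrative sketch of RFC 9112 chunked framing (not the server's code).
def encode_chunked(chunks):
    """Frame an iterable of byte strings as a chunked-encoded body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would terminate the body early
        out += format(len(chunk), "x").encode("ascii")  # chunk-size in hex
        out += b"\r\n"
        out += chunk                                     # chunk-data
        out += b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk marker plus empty trailer section
    return bytes(out)

print(encode_chunked([b"Wiki", b"pedia"]))
```

For example, the two chunks `Wiki` and `pedia` are framed as `4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n`.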

API to Send Chunked Data

We will refer to the following example when discussing how to send chunked data:

1 set httprsp("mime")="text/plain; charset=utf-8" ; Character set of the return URL
2 set httprsp("chunked",1)=$name(data1)
3 set httprsp("chunked",2)=$name(^data2)
4 set httprsp("chunked",3)=$name(^data3("foo"))
5 set httprsp("chunked",4)="chunkcallback1^myroutine"
6 set httprsp("chunked",5)="chunkcallback2^myroutine"

The important part to note here is that to send chunked data, you set httprsp("chunked",n). The example above sends 5 chunks (actually more than 5, since the last entry itself sends multiple chunks).

Lines 2-4 are mostly the same: we send data held in a local variable or global variable. In this case, the chunk calculation is done automatically, and as a developer you don't need to do anything else.

Things get more interesting with lines 5 and 6. These are routine callbacks, allowing you to produce chunks dynamically:

Here's chunkcallback1^myroutine, which sends a single chunk:

chunkcallback1 ; Test Callback for writing a single chunk
        new oldio set oldio=$io
        new file set file="/mwebserver/r/_ydbwebtest.m"
        ; Get file size
        open "D":(shell="/bin/sh":command="stat -c%s "_file:parse):0:"pipe"
        use "D"
        new size read size
        use oldio close "D"
        ;
        ; Send hex size
        do:httplog>2 stdout^%ydbwebutils("Sending chunk with size "_size)
        new hexsize set hexsize=$$dec2hex^%ydbwebutils(size)
        do w^%ydbwebrsp(hexsize_$char(13,10))
        ;
        ; read and send file
        ; Fixed prevents Reads to terminators on SD's. CHSET makes sure we don't analyze UTF.
        open file:(rewind:readonly:fixed:chset="M")
        use file
        ; hang simulates that we are sending lots of data slowly
        new x for  read x#4079:0 use oldio do w^%ydbwebrsp(x) hang .01 use file quit:$zeof
        use oldio close file
        ; now send end of this chunk (CRLF)
        do w^%ydbwebrsp($char(13,10))
        quit
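The chunk-size line written via $$dec2hex^%ydbwebutils above is simply the payload length in hexadecimal followed by CRLF (RFC 9112 permits either letter case). A tiny illustration in hypothetical Python, not part of the server:

```python
# Sketch only: build the chunk-size line for a payload of nbytes bytes.
def chunk_size_line(nbytes):
    # RFC 9112 chunk-size: hex digits, then CRLF
    return format(nbytes, "x").encode("ascii") + b"\r\n"

print(chunk_size_line(4079))  # 4079 matches the read size used above
```

So the 4079-byte reads in the example are announced on the wire as `fef\r\n`.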

Here's chunkcallback2^myroutine, which sends multiple chunks. Note the use of sendonechunk^%ydbwebrsp, which is provided for the convenience of the user. All you have to do is get data and call sendonechunk as many times as you want.

chunkcallback2 ; Test Callback for writing multiple small chunks
        new oldio set oldio=$io
        ; This file was chosen for testing as that exercises flush^%ydbwebrsp because it's larger than 32k
        new file set file="/mwebserver/r/_ydbwebtest.m"
        ;
        ; Get file size (for verifying that we sent the full file)
        open "D":(shell="/bin/sh":command="stat -c%s "_file:parse):0:"pipe"
        use "D"
        new fullsize read fullsize
        use oldio close "D"
        ;
        ; read and send file in chunks
        ; Fixed prevents Reads to terminators on SD's. CHSET makes sure we don't analyze UTF.
        open file:(rewind:readonly:fixed:chset="M":nowrap)
        use file
        new incsize,size set incsize=0
        new x for  read x#4079:0 quit:$zeof  set size=$$sendonechunk^%ydbwebrsp(x),incsize=incsize+size
        use oldio close file
        do:httplog>2 stdout^%ydbwebutils("full size: "_fullsize_" sent size: "_incsize)
        if fullsize'=incsize set $ecode=",U-signal-error,"
        quit

API to Receive Chunked Data

You normally do not need to do anything to receive chunked data. It will automatically work. However, in case the chunks you receive are very large, you can add a callback in _ydbweburl like this:

        ;;POST test/postchunkedinc chunkedpostinc^%ydbwebapi chunkCallback=chunkedpostincread^%ydbwebapi

For each chunk, we call chunkedpostincread^%ydbwebapi. In that code, you can do whatever you want with httpreq("body"); httpreq("body") is then killed after the callback returns, avoiding the need to hold a large amount of data in memory. Here's a sample implementation:

chunkedpostincread ; Incremental read of each chunk
    merge ^chunkedread($increment(^chunkedread))=httpreq("body")
    quit
    ;
chunkedpostinc ; POST /test/postchunkedinc Incremental Read Chunk Test
    new charcount set charcount=0
    set httprsp("mime")="text/plain; charset=utf-8" ; Character set of the return URL
    new i,j for i=0:0 set i=$order(^chunkedread(i)) quit:'i  for j=0:0 set j=$order(^chunkedread(i,j)) quit:'j  set charcount=charcount+$zlength(^chunkedread(i,j))
    kill ^chunkedread
    set httprsp=charcount_" bytes received "_$char(13,10)
    quit
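The receive-with-callback flow can be sketched as follows. This is a hypothetical Python illustration of the mechanism (read the hex size line, read that many bytes, hand them to the callback, then discard them), not the actual M implementation in rdchnks^%ydbwebreq:

```python
# Sketch of the chunked receive path with a per-chunk callback.
import io

def read_chunked(stream, on_chunk):
    """Decode a chunked body, calling on_chunk(data) for each chunk.
    Returns the total number of payload bytes received."""
    total = 0
    while True:
        size = int(stream.readline().strip(), 16)  # chunk-size line (hex)
        if size == 0:
            stream.readline()  # consume the CRLF after the last-chunk marker
            return total
        data = stream.read(size)
        stream.read(2)         # consume the CRLF that ends this chunk
        on_chunk(data)         # caller processes the chunk; it is then dropped
        total += size

received = []
n = read_chunked(io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"),
                 received.append)
print(n, received)
```

The analogue of the `chunkCallback=` mechanism is `on_chunk`: each chunk is processed as it arrives, so the full body never has to be held in memory at once.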

Miscellaneous

Miscellaneous bugs found during development:

  • #44 (closed): If a global contained only a single value (e.g. ^web3="foo"), it was not killed after being sent. That was incorrect and is now fixed. (It did work correctly if the global had subscripted data.) Added a /test/gloreturn3 URL to verify that data gets deleted when sending this kind of global. Tested by tGlo^%ydbwebtest.
  • #139 (closed): The gzip code path did not get the fix for #139 (made in a previous commit), because the code was duplicated and the duplicate was missed when #139 was fixed. The code is now shared, and tests were added to tGlo^%ydbwebtest to ensure all globals are appropriately queried and killed in gzip mode.
  • As a result of #139, the gzip tests were significantly enhanced to ensure proper code coverage of gzip mode.
  • Handling of Expect: 100-continue broke at some point in the past (it's not known when), and there was no test to catch it. It broke because an extra CRLF (the delimiter on the connection) was sent over. Added a tExpect test to specifically test for this.

Misc changes:

  • Added commented-out calls to ydb_env_set in ci/run_test.sh to create a database for testing. The web server does not require a database, so keeping these calls commented out prevents us from accidentally introducing a database dependency.
  • Added a server-gzip mode to ci/run_test.sh to allow developers to run the server with gzip.
  • Bumped the version number to 4.5.0.
Edited by Sam Habiel