Skip to content

tcp: Add memory barrier to tcp_push()

JIRA: https://issues.redhat.com/browse/RHEL-22708
Upstream Status: linux.git

commit 7267e8dcad6b2f9fce05a6a06335d7040acbc2b6
Author: Salvatore Dipietro dipiets@amazon.com
Date: Fri Jan 19 11:01:33 2024 -0800

tcp: Add memory barrier to tcp_push()  

On CPUs with weak memory models, reads and updates performed by tcp_push  
to the sk variables can get reordered leaving the socket throttled when  
it should not. The tasklet running tcp_wfree() may also not observe the  
memory updates in time and will skip flushing any packets throttled by  
tcp_push(), delaying the sending. This can pathologically cause 40ms  
extra latency due to bad interactions with delayed acks.  

Adding a memory barrier in tcp_push removes the bug, similarly to the  
previous commit bf06200e732d ("tcp: tsq: fix nonagle handling").  
smp_mb__after_atomic() is used to not incur in unnecessary overhead  
on x86 since not affected.  

Patch has been tested using an AWS c7g.2xlarge instance with Ubuntu  
22.04 and Apache Tomcat 9.0.83 running the basic servlet below:  

import java.io.IOException;  
import java.io.OutputStreamWriter;  
import java.io.PrintWriter;  
import javax.servlet.ServletException;  
import javax.servlet.http.HttpServlet;  
import javax.servlet.http.HttpServletRequest;  
import javax.servlet.http.HttpServletResponse;  

public class HelloWorldServlet extends HttpServlet {  
    @Override  
    protected void doGet(HttpServletRequest request, HttpServletResponse response)  
      throws ServletException, IOException {  
        response.setContentType("text/html;charset=utf-8");  
        OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8");  
        String s = "a".repeat(3096);  
        osw.write(s,0,s.length());  
        osw.flush();  
    }  
}  

Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS  
c6i.8xlarge instance. Before the patch an additional 40ms latency from P99.99+  
values is observed while, with the patch, the extra latency disappears.  

No patch and tcp_autocorking=1  
./wrk -t32 -c128 -d40s --latency -R10000  http://172.31.60.173:8080/hello/hello  
  ...  
 50.000%    0.91ms  
 75.000%    1.13ms  
 90.000%    1.46ms  
 99.000%    1.74ms  
 99.900%    1.89ms  
 99.990%   41.95ms  <<< 40+ ms extra latency  
 99.999%   48.32ms  
100.000%   48.96ms  

With patch and tcp_autocorking=1  
./wrk -t32 -c128 -d40s --latency -R10000  http://172.31.60.173:8080/hello/hello  
  ...  
 50.000%    0.90ms  
 75.000%    1.13ms  
 90.000%    1.45ms  
 99.000%    1.72ms  
 99.900%    1.83ms  
 99.990%    2.11ms  <<< no 40+ ms extra latency  
 99.999%    2.53ms  
100.000%    2.62ms  

Patch has been also tested on x86 (m7i.2xlarge instance) which it is not  
affected by this issue and the patch doesn't introduce any additional  
delay.  

Fixes: 7aa5470c2c09 ("tcp: tsq: move tsq_flags close to sk_wmem_alloc")  
Signed-off-by: Salvatore Dipietro <dipiets@amazon.com>  
Reviewed-by: Eric Dumazet <edumazet@google.com>  
Link: https://lore.kernel.org/r/20240119190133.43698-1-dipiets@amazon.com  
Signed-off-by: Paolo Abeni <pabeni@redhat.com>  

Signed-off-by: Antoine Tenart atenart@redhat.com

Merge request reports