Causal mask when using cross-attn

Hi, thanks for this great repo. I'm wondering whether the causal mask should be applied in the second forward pass when cross-attn is used. Or am I missing something here? Thank you.