vfs_glusterfs: fix directory fd leak via FSP extension destructor
When Samba closes a directory backed by vfs_glusterfs, the glfs_fd_t opened by vfs_gluster_openat() is never closed. This leaks one libgfapi file descriptor and one server-side fd_t in glusterfsd per directory open/close cycle. With persistent SMB2 connections the leak is unbounded and drives monotonic RSS growth on the GlusterFS brick process.
vfs_glusterfs creates two independent glfs_fd_t handles per directory: one via glfs_open() in vfs_gluster_openat(), stored in the FSP extension, and another via glfs_opendir() in vfs_gluster_fdopendir(), tracked by struct smb_Dir. On close, smb_Dir_destructor() closes the opendir handle and sets the pathref fd to -1. fd_close() then returns early without calling SMB_VFS_CLOSE, so vfs_gluster_close() never runs and the glfs_open() handle is orphaned. The original code passed NULL as the destroy callback to VFS_ADD_FSP_EXTENSION, so there was no safety net.
The default VFS does not have this problem because fdopendir(3) wraps the existing kernel fd rather than opening a new handle. libgfapi has no equivalent -- glfs_opendir() always creates an independent handle by path. The actual glfs_fd_t is stored in the FSP extension, not in fsp->fh->fd (which holds a sentinel value), so Samba's generic close path cannot reach it. When GlusterFS is accessed via a FUSE mount instead of libgfapi, the kernel enforces the fd lifecycle through FUSE_RELEASEDIR and no leak is possible.
The fix registers vfs_gluster_fsp_ext_destroy() as the FSP extension destroy callback. It calls glfs_close() on the stored pointer and is invoked by vfs_remove_all_fsp_extensions() during file_free(), which runs unconditionally for every fsp. In the explicit close path, vfs_gluster_close() NULLs the extension pointer before calling VFS_REMOVE_FSP_EXTENSION to prevent double-close. This follows the same pattern used by vfs_ceph_new.c (vfs_ceph_fsp_ext_destroy_cb).
Observed on a production file server with 32 persistent SMB2 connections and continuous directory operations. GlusterFS brick statedumps showed fd_t pool growth from 1,993 to 80,350 active instances over 6 days, roughly 13,000 leaked fds per day per brick.
One file changed (source3/modules/vfs_glusterfs.c), 37 insertions, 2 deletions. No changes to the extension data type, directory operations, or any of the 30+ vfs_gluster_fetch_glfd call sites. Regular file close behavior is unchanged.
No in-tree test is possible as vfs_glusterfs requires a live GlusterFS cluster. Built with ./configure.developer and -Werror in the upstream CI container (Ubuntu 22.04) with zero warnings from vfs_glusterfs.c. Tested with samba3.smbtorture_s3 file operation tests (25/25 passed) and samba3.vfs tests (all failures confined to nt4_dc environment setup, unrelated to this change).
Checklist:

- Commits have `Signed-off-by:` with name/author being identical to the commit author
- (optional) This MR is just one part towards a larger feature.
- (optional, if backport required) Bugzilla bug filed and `BUG:` tag added
- Test suite updated with functionality tests
- Test suite updated with negative tests
- Documentation updated
- CI timeout is 3h or higher (see Settings/CI/CD/General pipelines/Timeout)
Reviewer's checklist:

- There is a test suite reasonably covering new functionality or modifications
- Function naming, parameters, return values, types, etc., are consistent and according to README.Coding.md
- This feature/change has adequate documentation added
- No obvious mistakes in the code