Memory Leak When Processing Large Datasets with ImFusion Python SDK

Hello

I’ve been working with the ImFusion Python SDK to process large 3D medical imaging datasets, and I’ve noticed a significant memory leak over time. When running scripts that repeatedly load, process, and visualize images, memory usage keeps increasing until the system runs out of RAM. This happens even when explicitly deleting objects and calling gc.collect().
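
Roughly, the script does something like the sketch below (the paths, preprocess() and visualize() are simplified placeholders for my actual pipeline, and imfusion.load() stands in for the loading call):

```python
import gc

import imfusion  # ImFusion Python SDK

def preprocess(image):
    # placeholder for the actual processing (resampling, filtering, ...)
    return image

def visualize(image):
    # placeholder for the visualization step
    pass

# hypothetical dataset location
paths = ["/data/case_{:03d}.nii".format(i) for i in range(500)]

for path in paths:
    data = imfusion.load(path)   # stand-in for the actual loading call
    image = data[0]

    result = preprocess(image)
    visualize(result)

    # explicit cleanup that does not seem to help
    del result, image, data
    gc.collect()
```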

I’ve tried debugging with memory profiling tools and noticed that certain objects related to image handling and visualization are not being freed properly. It seems like there might be internal references within the SDK that prevent proper garbage collection. Restarting the script clears the memory, but that’s not a viable solution for long-running applications. For reference, I also looked at https://stackoverflow.com/questions/17110694/memory-leak-with-large-dataset-when-using-mysql-python-MongoDB.
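
For what it’s worth, this is roughly how I checked the growth: a plain tracemalloc comparison between snapshots, nothing SDK-specific (process_one_case() stands in for one iteration of the loop above; note that tracemalloc only sees Python-level allocations, so native allocations inside the SDK would not show up here):

```python
import gc
import tracemalloc

def process_one_case(path):
    # stands in for one load / process / visualize iteration from the loop above
    ...

paths = ["/data/case_{:03d}.nii".format(i) for i in range(500)]  # same placeholder list as above

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

for i, path in enumerate(paths):
    process_one_case(path)
    gc.collect()

    if i and i % 50 == 0:
        snapshot = tracemalloc.take_snapshot()
        print("--- after {} cases ---".format(i))
        # allocation sites that grew the most since the baseline
        for stat in snapshot.compare_to(baseline, "lineno")[:10]:
            print(stat)
```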

Has anyone else encountered this issue? Are there recommended best practices for managing memory efficiently when working with large datasets in the ImFusion Python SDK? :thinking: Any insights on manually releasing resources or forcing cleanup would be greatly appreciated.
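
The only stopgap I have found so far is to automate the restart: run each chunk of files in a short-lived worker process so the OS reclaims everything when the worker exits. This is plain Python multiprocessing rather than anything SDK-specific, and process_chunk() is again a placeholder for the actual work:

```python
import glob
import multiprocessing as mp

def process_chunk(paths):
    # placeholder: import imfusion here and run the load/process/visualize
    # loop for this chunk only; all memory is returned to the OS when the
    # worker process exits.
    import imfusion  # noqa: F401
    for path in paths:
        ...

if __name__ == "__main__":
    paths = sorted(glob.glob("/data/*.nii"))  # hypothetical dataset location
    chunk_size = 20
    ctx = mp.get_context("spawn")  # fresh interpreter for each worker
    for start in range(0, len(paths), chunk_size):
        worker = ctx.Process(target=process_chunk, args=(paths[start:start + chunk_size],))
        worker.start()
        worker.join()
```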

Thank you! :blush:

Hi volekay,

Thanks for bringing this to our attention.
Could you please elaborate on what your script is doing with the images or, better yet, provide a minimal code example?
We are using the Python SDK internally in long-running ML training jobs as well and have not observed memory leaks.

When you say “visualize”, are you referring to imfusion.show()?
So far we are not aware of any memory leaks there, but the functionality is new (and not used in any of our own pipelines), so the possibility of an undiscovered leak is definitely there.
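
In the meantime, one quick way to narrow it down would be to run the same loop once with and once without the visualization and compare the process memory at the end. A rough sketch of what we mean (imfusion.load() and passing the loaded data directly to imfusion.show() are just stand-ins for however you load and display your data; psutil is only used to read the resident set size):

```python
import gc
import os

import psutil    # third-party helper, only used to read the process RSS
import imfusion

def rss_mb():
    return psutil.Process(os.getpid()).memory_info().rss / 1e6

def run(paths, with_show):
    for path in paths:
        data = imfusion.load(path)   # stand-in for your loading call
        if with_show:
            imfusion.show(data[0])   # stand-in for your visualization call
        del data
        gc.collect()
    print("with_show={} -> RSS {:.0f} MB".format(with_show, rss_mb()))

# Compare the two memory curves:
# run(paths, with_show=False)
# run(paths, with_show=True)
```

If the memory only grows in the second run, that would point us at the visualization path; if it grows in both, the leak is more likely in loading or processing.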