Selection fixes

Bharath Raghavan requested to merge sele_fixes into master

After the MPT format was changed, I went ahead and tried to fix the ResID problem. I made the following main changes in the way the selection language works:

  • Previously, getSystemTopology() had to generate the full dataframe so that selections could use the convenient pandas boolean indexing. However, I found that such a large dataframe took too much memory and was inefficient. Memory estimation is a bit tricky in Python, but for my system the dataframe appeared to be around 100 MB.
  • So, each column of the dataframe is now stored internally as a list. Lists can be grown (appended to) faster than dataframes and numpy arrays, and take up much less memory.
  • However, selecting (especially multiple elements) is much faster with numpy. So, for a selection, each of these lists is converted into a numpy array. The selection language is then converted into something np.where() can understand, instead of a pandas boolean expression.
  • All lists are generated when the MPT() constructor is called (if only writing is needed, mode='w' can be passed to skip generating the lists and save time). ResID is its own list and has to be generated separately. It takes a bit longer to generate than the other lists, and I have not found a way to shorten this time. In any case, the whole loading takes around 1.5 s on my local machine and 2.5 s on the remote machine, so it should be fine.
  • Also, a rudimentary version of the VMD selector has been added. I do not have a stable working version of VMD, so I have not been able to test it properly.
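
To illustrate the list-to-numpy approach described above, here is a minimal sketch. It is not the code in this branch: the column lists (names, resnames, resids) and the select() helper are hypothetical stand-ins, and the real selection language is parsed from a string rather than passed as keyword arguments.

```python
import numpy as np

# Hypothetical columns kept as plain Python lists, which are cheap to append to
# while the topology is being read.
names = ["N", "CA", "C", "O", "OH2", "H1"]
resnames = ["ALA", "ALA", "ALA", "ALA", "SOL", "SOL"]
resids = [1, 1, 1, 1, 2, 2]


def select(resname=None, resid=None):
    """Return indices of atoms that match all of the given criteria."""
    # Start with every atom selected, then narrow the mask per criterion.
    mask = np.ones(len(names), dtype=bool)
    if resname is not None:
        # Convert the list to a numpy array only when a selection is made.
        mask &= np.asarray(resnames) == resname
    if resid is not None:
        mask &= np.asarray(resids) == resid
    # np.where() on a boolean mask returns a tuple of index arrays.
    return np.where(mask)[0]


sol_atoms = select(resname="SOL")
ala_res1 = select(resname="ALA", resid=1)
```

The lists stay as the storage format, and the numpy arrays exist only for the duration of a selection, so the cost of conversion is paid only when something is actually selected.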

I've tried to add comments for clarity. We will discuss this pull request anyway. Please go through this branch and test it well; only then will we merge it.