Change scoring metric implementations to take in whole dataframes
Instead of taking in just one column, metric implementations should take the whole predictions dataframe as input and decide on their own which columns to use and how (predictions, confidence, transforming the data into something else, etc.).
This also means that applicability_to_targets would be removed, because everything would always be provided to all metrics.
See https://gitlab.datadrivendiscovery.org/MIT-LL/d3m_data_supply/issues/155 and https://gitlab.datadrivendiscovery.org/MIT-LL/d3m_data_supply/issues/160 and https://gitlab.datadrivendiscovery.org/MIT-LL/d3m_data_supply/issues/156 for more information.
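
To make the proposal concrete, here is a minimal sketch of what a metric taking whole dataframes could look like. It is only an illustration, not the actual implementation: the function name `accuracy`, the `d3mIndex` key column, the `confidence` column, and the `species` target are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical set of non-target columns; the real layout may differ.
AUXILIARY_COLUMNS = {'d3mIndex', 'confidence'}


def accuracy(truth: pd.DataFrame, predictions: pd.DataFrame) -> float:
    """Sketch of a metric that receives whole dataframes instead of one column."""
    # The metric itself decides which columns are targets.
    target_columns = [c for c in truth.columns if c not in AUXILIARY_COLUMNS]

    # Align rows on the assumed key column, so row order does not matter.
    merged = truth.merge(predictions, on='d3mIndex', suffixes=('_truth', '_pred'))

    # A row counts as correct only if every target column matches.
    correct = pd.Series(True, index=merged.index)
    for column in target_columns:
        correct &= merged[column + '_truth'] == merged[column + '_pred']

    return float(correct.mean())


truth = pd.DataFrame({
    'd3mIndex': [0, 1, 2, 3],
    'species': ['a', 'b', 'a', 'c'],
})
predictions = pd.DataFrame({
    'd3mIndex': [3, 2, 1, 0],
    'species': ['c', 'a', 'a', 'a'],
    'confidence': [0.9, 0.6, 0.7, 0.8],
})

score = accuracy(truth, predictions)
```

In this sketch the accuracy metric simply ignores the `confidence` column, while a ranking-based metric could read it from the same dataframe. That is why nothing like applicability_to_targets is needed here: each metric sees all columns and picks what it needs.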