Establishing the foundations for a data-centric AI approach for virtual drug screening through a systematic assessment of the properties of chemical data