https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#Head
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://www.nanopub.org/nschema#hasAssertion
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://www.nanopub.org/nschema#hasProvenance
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#provenance
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://www.nanopub.org/nschema#hasPublicationInfo
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#pubinfo
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.nanopub.org/nschema#Nanopublication
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion
http://id.crossref.org/issn/2451-8492
http://purl.org/dc/terms/title
Data Science
https://doi.org/10.3233/DS-240059
http://purl.org/dc/terms/abstract
Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case.
https://doi.org/10.3233/DS-240059
http://purl.org/dc/terms/date
2024-06-26
https://doi.org/10.3233/DS-240059
http://purl.org/dc/terms/hasPart
https://w3id.org/kpxl/ios/ds/np/RA0XRooQKz2A7aoP0VJLS2NKcvQv-n7RwPoYtcD4wtTPc
https://doi.org/10.3233/DS-240059
http://purl.org/dc/terms/isPartOf
http://id.crossref.org/issn/2451-8492
https://doi.org/10.3233/DS-240059
http://purl.org/dc/terms/title
Measuring Data Drift with the Unstable Population Indicator
https://doi.org/10.3233/DS-240059
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://purl.org/spar/fabio/ResourcePaper
https://orcid.org/0000-0003-2581-8370
http://schema.org/affiliation
https://ror.org/04dkp9463
https://orcid.org/0000-0003-2581-8370
http://schema.org/affiliation
https://ror.org/05xvt9f17
https://orcid.org/0000-0003-2581-8370
http://schema.org/email
datascience@marcelhaas.com
https://orcid.org/0000-0003-2581-8370
http://xmlns.com/foaf/0.1/name
Marcel R. Haas
https://orcid.org/0009-0003-5030-0108
http://schema.org/affiliation
https://ror.org/04b8v1s79
https://orcid.org/0009-0003-5030-0108
http://schema.org/affiliation
https://ror.org/04dkp9463
https://orcid.org/0009-0003-5030-0108
http://schema.org/email
L.Sibbald@tilburguniversity.edu
https://orcid.org/0009-0003-5030-0108
http://xmlns.com/foaf/0.1/name
Lisette Sibbald
https://ror.org/04b8v1s79
http://xmlns.com/foaf/0.1/name
Department of Methodology and Statistics and Department of Cognitive Neuropsychology, Tilburg University, Prof. Cobbenhagenlaan 125, 5037 DB Tilburg, The Netherlands
https://ror.org/04dkp9463
http://xmlns.com/foaf/0.1/name
Business Intelligence, University of Amsterdam, Spui 21, 1012WX Amsterdam, The Netherlands
https://ror.org/05xvt9f17
http://xmlns.com/foaf/0.1/name
Public Health and Primary Care, Leiden University Medical Center, Albinusdreef 2, The Netherlands
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
https://orcid.org/0000-0003-2581-8370
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list__1
http://www.w3.org/1999/02/22-rdf-syntax-ns#_2
https://orcid.org/0009-0003-5030-0108
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#provenance
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion
http://www.w3.org/ns/prov#wasAttributedTo
https://orcid.org/0000-0003-2581-8370
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#assertion
http://www.w3.org/ns/prov#wasAttributedTo
https://orcid.org/0009-0003-5030-0108
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#pubinfo
https://orcid.org/0000-0002-1267-0234
http://xmlns.com/foaf/0.1/name
Tobias Kuhn
https://orcid.org/0000-0003-2581-8370
http://xmlns.com/foaf/0.1/name
Marcel R. Haas
https://orcid.org/0009-0003-5030-0108
http://xmlns.com/foaf/0.1/name
Lisette Sibbald
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list
http://www.w3.org/1999/02/22-rdf-syntax-ns#_1
https://orcid.org/0000-0003-2581-8370
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list
http://www.w3.org/1999/02/22-rdf-syntax-ns#_2
https://orcid.org/0009-0003-5030-0108
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig
http://purl.org/nanopub/x/hasAlgorithm
RSA
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig
http://purl.org/nanopub/x/hasPublicKey
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCjDGQCS1S+SRnERDuYDXOugdYUP0efEquHJEEHAbU/uLzBVlga89zqrNPCS7fBE6lArBUWEmT8eLKdMapyqvAzI1J3jUWTMhDJF+XFBkUiuiFfNSc4vJJcmi0yujtnuzXsRIG202jyaP4f5ULoskFwaZOSBZJfiE0dsB3D7DTIAQIDAQAB
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig
http://purl.org/nanopub/x/hasSignature
Ox+5X6nHLumNtHd4Ka2ICEWhUX+v6KVWn4UKDEEAixySaGj9TJt/mBFpssxtxcrM29g070GCs1SakxQ2Re3c6lUEEkHh/E4MLDc9ReR2vZoLi2oUzJfKzWC+WuTjML12q88gZUw9uoWThRpPW+j4XOn8dUrPk8DffrF/R1+Hrg8=
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig
http://purl.org/nanopub/x/hasSignatureTarget
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#sig
http://purl.org/nanopub/x/signedBy
https://orcid.org/0000-0002-1267-0234
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/dc/terms/created
2024-07-12T09:07:29.273Z
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/dc/terms/creator
https://orcid.org/0000-0002-1267-0234
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/dc/terms/isPartOf
https://doi.org/10.3233/DS-240059
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/dc/terms/license
https://creativecommons.org/licenses/by/4.0/
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/nanopub/x/hasNanopubType
http://purl.org/spar/fabio/ScholarlyWork
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/nanopub/x/hasNanopubType
https://w3id.org/kpxl/ios/ds/terms/DataScienceNanopub
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/nanopub/x/introduces
https://doi.org/10.3233/DS-240059
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/nanopub/x/supersedes
https://w3id.org/kpxl/ios/ds/np/RALO1noJ6z4w0bumoQuKpUVKT7HE_zagqAA8Qy4djeLg0
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/nanopub/x/wasCreatedAt
https://nanodash.petapico.org/
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://purl.org/ontology/bibo/authorList
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ#author-list
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
http://www.w3.org/2000/01/rdf-schema#label
Article: Measuring Data Drift with the Unstable Population Indicator
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromProvenanceTemplate
http://purl.org/np/RAi6zZAwhaJ23Hzg4lIjlPir6Take3ZQp-lS9skfBEwfQ
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
http://purl.org/np/RAA2MfqdBCzmz9yVWjKLXNbyfBNcwsMmOqcNUxkk1maIM
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
http://purl.org/np/RAh1gm83JiG5M6kDxXhaYT1l49nCzyrckMvTzcPn-iv90
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
http://purl.org/np/RAjpBMlw3owYhJUBo3DtsuDlXsNAJ8cnGeWAutDVjuAuI
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
https://w3id.org/np/RA5R_qv3VsZIrDKd8Mr37x3HoKCsKkwN5tJVqgQsKhjTE
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
https://w3id.org/np/RAIabr2sRVJ-YOIwZRD__BVMJKnq3QtQw_mjLIGSACPAI
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
https://w3id.org/np/RA_JdI7pfDcyvEXLr_gper3h8egmNggeTqkJbyHrlMEdo
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromPubinfoTemplate
https://w3id.org/np/RAoWx0AJvNw-WqkGgZO4k8udNCg6kMcGZARN3DgO_5TII
https://w3id.org/kpxl/ios/ds/np/RAn08NC9isAMOUJCNaGrh31KHTSet2xemhzqS9YNB49hQ
https://w3id.org/np/o/ntemplate/wasCreatedFromTemplate
https://w3id.org/np/RAeQJfX3lMDqtzyddnRmlBvxSoWohzEKzsaMKWrR8K6J0