Our primary goal is to ensure that the bibliographies (lists of publications) of authors in dblp are correct. This means that all publications of a person should appear in the same list and that each list should contain only publications of one specific person. This can be difficult to ensure, and despite our best efforts, we sometimes assign publications to the wrong publication list. Because of this, we frequently check our data set and correct mistakes. The following figure shows the number of corrections we made in the last twenty years.
In a merge correction, two (or more) publication lists are merged. For example, we discover that A. Jones and Adam Jones are the same person. A split fixes a defect where multiple authors are listed on the same publication list. For example, Adam Jones is split into Adam Jones 0001 and Adam Jones 0002 (see our FAQ for more details). A distribute is a correction where we move publications from one publication list to another. A single correction can affect a large number of publications; for instance, merging two large bibliographies counts as one correction. Each correction is checked by hand.
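To make the three correction types concrete, here is a minimal sketch in Python. It is only an illustration, not how dblp stores or edits its data: the dictionary `lists`, the publication keys `p1` to `p6`, the person B. Smith, and the function names are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical in-memory model: publication list name -> set of publication keys.
lists = {
    "A. Jones": {"p1", "p2"},
    "Adam Jones": {"p3", "p4", "p5"},
    "B. Smith": {"p6"},
}
corrections = Counter()  # one count per correction, not per publication


def merge(target, *sources):
    """Merge one or more source lists into the target list."""
    for name in sources:
        lists[target] |= lists.pop(name)
    corrections["merge"] += 1


def split(source, parts):
    """Replace one list by several disjoint lists (new name -> publications)."""
    assigned = set().union(*parts.values())
    # Every publication must end up in exactly one of the new lists.
    assert assigned == lists[source]
    assert sum(len(p) for p in parts.values()) == len(lists[source])
    del lists[source]
    lists.update({name: set(pubs) for name, pubs in parts.items()})
    corrections["split"] += 1


def distribute(source, target, pubs):
    """Move some publications from one existing list to another."""
    pubs = set(pubs)
    assert pubs <= lists[source]
    lists[source] -= pubs
    lists[target] |= pubs
    corrections["distribute"] += 1


# A. Jones and Adam Jones turn out to be the same person.
merge("Adam Jones", "A. Jones")
# The merged list actually mixes two different people.
split("Adam Jones", {
    "Adam Jones 0001": {"p1", "p2", "p3"},
    "Adam Jones 0002": {"p4", "p5"},
})
# One publication belongs to a different list altogether.
distribute("Adam Jones 0002", "B. Smith", {"p5"})
```

In this sketch the merge moves two publications but still counts as a single correction, which mirrors the counting described above.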
As you can see, the number of corrections has increased in recent years with a peak in 2019. There are multiple reasons for this:
- Our data set keeps growing. That means more room for mistakes, but also more data to detect problematic cases automatically.
- Because of our extended federal and state funding, we have more resources to look at potential defects.
- ORCID! We now use ORCID data provided with publications as a central tool in tracing potential errors. An ORCID uniquely identifies an author and makes person disambiguation much easier. If you do not have one, please register with ORCID (for free) and make sure your ORCID is associated with your publications (many publishers make this possible now).
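As a rough idea of how ORCID data can point to potential errors, consider the following Python sketch. It is an assumed heuristic for illustration, not a description of our actual tooling: the record layout, the function name `find_candidates`, and the ORCID values (made-up placeholders) are hypothetical. The same ORCID on several publication lists hints at a merge; several ORCIDs on one list hint at a split or distribute.

```python
from collections import defaultdict

# Hypothetical records: (publication key, publication list, ORCID or None).
# The ORCID values are placeholders, not real identifiers.
records = [
    ("p1", "A. Jones", "0000-0000-0000-0001"),
    ("p2", "Adam Jones", "0000-0000-0000-0001"),
    ("p3", "Adam Jones", "0000-0000-0000-0002"),
    ("p4", "B. Smith", None),  # no ORCID supplied with this publication
    ("p5", "B. Smith", "0000-0000-0000-0003"),
]


def find_candidates(records):
    """Return (merge candidates, split candidates) derived from ORCID usage."""
    lists_by_orcid = defaultdict(set)
    orcids_by_list = defaultdict(set)
    for _pub, list_name, orcid in records:
        if orcid is None:
            continue  # nothing to learn from publications without an ORCID
        lists_by_orcid[orcid].add(list_name)
        orcids_by_list[list_name].add(orcid)
    # One ORCID spread over several lists: possibly the same person.
    merge_candidates = {
        orcid: sorted(names)
        for orcid, names in lists_by_orcid.items()
        if len(names) > 1
    }
    # Several ORCIDs on one list: possibly several people.
    split_candidates = {
        name: sorted(orcids)
        for name, orcids in orcids_by_list.items()
        if len(orcids) > 1
    }
    return merge_candidates, split_candidates


merge_candidates, split_candidates = find_candidates(records)
```

The results are only candidates. ORCIDs supplied with a publication can themselves be wrong, which is one reason every correction is still checked by hand.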
We still have large lists of potential errors, so 2020 should be another productive year :-). If you encounter any error in dblp, please let us know (see our FAQ for details).