Script update-metrics using service scopus does not update all publications #511

@jorgeltd

Describe the bug
DSpace-CRIS Version: 2024.02.00

Issue Description
Running the script update-metrics --service scopus does not add metrics for all publications inside CRIS.
In my instance, with 866 publications having either a DOI or a Scopus ID, only 442 publications get their metrics updated.

Steps to Reproduce

  1. Add a considerable number of publications (e.g., 100) with the metadata dc.identifier.doi or dc.identifier.scopus, corresponding to publications indexed in Scopus. Identifiers should be unique, and each document should have a citation count greater than 0.
  2. Run the script update-metrics --service scopus.

Expected Behavior
All publications with a valid identifier should have their metrics updated, respecting the --limit parameter of the script.

Related Work
Similarly to issue #508, there appears to be a problem with the item iterator and the committing of results.
In the function updateMetric of UpdateScopusMetrics.java, there is a while block starting at line 88 where the final action is to commit the obtained metrics of the item.
This seems to alter the item iterator, reducing the number of items that should be updated.

@Override
public long updateMetric(Context context, Iterator<Item> itemIterator, String param) {
    long updatedItems = 0;
    long foundItems = 0;
    long apiCalls = 0;
    logsCache = new ArrayList<>();
    try {
        while (itemIterator.hasNext()) {
            Map<String, Item> queryMap = new HashMap<>();
            List<Item> itemList = new ArrayList<>();
            for (int i = 0; i < fetchSize && itemIterator.hasNext(); i++) {
                Item item = itemIterator.next();
                logAndCache("Adding item with uuid: " + item.getID());
                setLastImportMetadataValue(context, item);
                itemList.add(item);
            }
            foundItems += itemList.size();
            String id = this.generateQuery(queryMap, itemList);
            logAndCache("Getting scopus metrics for " + id);
            updatedItems +=
                scopusProvider.getScopusList(this.generateQuery(queryMap, itemList))
                    .stream()
                    .filter(Objects::nonNull)
                    .map(scopusMetric -> this.updateScopusMetrics(
                            context,
                            this.findItem(queryMap, scopusMetric),
                            scopusMetric
                        )
                    )
                    .filter(BooleanUtils::isTrue)
                    .count();
            apiCalls++;
            context.commit();
        }
    } catch (SQLException e) {
        logAndCacheError("Error while updating scopus' metrics", e);
    } finally {
        logAndCache("Found and fetched " + foundItems + " with " + apiCalls + " api calls!");
    }
    logsCache.addAll(scopusProvider.getLogs());
    return updatedItems;
}

I moved the commit call outside the while block on my instance, and that seems to fix the problem.
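
To illustrate one way a commit inside the loop could shrink the iteration, here is a minimal, self-contained toy model. It is not DSpace-CRIS code: `Store`, `PagedIterator`, and `run` are hypothetical names. It assumes the item iterator pages through a query by offset, and that committed items drop out of that query's result set (for example because `setLastImportMetadataValue` changes a field the query filters on). I have not confirmed either assumption against the DSpace-CRIS source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class CommitInsideLoopDemo {

    /** Hypothetical store: once committed, touched items no longer match the "pending" query. */
    static class Store {
        final List<Integer> pending = new ArrayList<>();
        final List<Integer> touched = new ArrayList<>();

        Store(int total) {
            for (int i = 0; i < total; i++) {
                pending.add(i);
            }
        }

        /** Offset-based page over the items that currently match the query. */
        List<Integer> queryPending(int offset, int limit) {
            int from = Math.min(offset, pending.size());
            int to = Math.min(offset + limit, pending.size());
            return new ArrayList<>(pending.subList(from, to));
        }

        void touch(int id) {
            touched.add(id);
        }

        /** Committing makes the touched items disappear from the query result. */
        void commit() {
            pending.removeAll(touched);
            touched.clear();
        }
    }

    /** Hypothetical iterator that fetches the next page by offset when the current one is used up. */
    static class PagedIterator implements Iterator<Integer> {
        private final Store store;
        private final int pageSize;
        private int offset = 0;
        private List<Integer> page = new ArrayList<>();
        private int index = 0;

        PagedIterator(Store store, int pageSize) {
            this.store = store;
            this.pageSize = pageSize;
        }

        @Override
        public boolean hasNext() {
            if (index >= page.size()) {
                page = store.queryPending(offset, pageSize);
                offset += page.size();
                index = 0;
            }
            return index < page.size();
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return page.get(index++);
        }
    }

    /** Same loop shape as updateMetric: batches of fetchSize, commit inside or after the while block. */
    static int run(int total, int fetchSize, boolean commitInsideLoop) {
        Store store = new Store(total);
        Iterator<Integer> it = new PagedIterator(store, fetchSize);
        int processed = 0;
        while (it.hasNext()) {
            for (int i = 0; i < fetchSize && it.hasNext(); i++) {
                store.touch(it.next());
                processed++;
            }
            if (commitInsideLoop) {
                store.commit();
            }
        }
        if (!commitInsideLoop) {
            store.commit();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(100, 10, true));  // prints 50
        System.out.println(run(100, 10, false)); // prints 100
    }
}
```

In this toy model, committing after every batch removes the processed items from the result set while the offset keeps advancing, so every other page is skipped and only about half of the items are visited. That ratio resembles the 442 of 866 I see, but the resemblance alone does not confirm that this is what happens in DSpace-CRIS.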
