Description
Describe the bug
DSpace-CRIS Version: 2024.02.00
Issue Description
Running the script update-metrics --service scopus does not add metrics for all publications inside CRIS.
In my instance, with 866 publications having either a DOI or a Scopus ID, only 442 publications get their metrics updated.
Steps to Reproduce
- Add a considerable number of publications (e.g., 100) with the metadata dc.identifier.doi or dc.identifier.scopus, corresponding to publications indexed in Scopus. Identifiers should be unique and the documents should have a citation count greater than 0.
- Run the script update-metrics --service scopus.
Expected Behavior
All publications with a valid identifier should have their metrics updated, respecting the --limit parameter of the script.
Related Work
Similarly to issue #508, there appears to be a problem with the item iterator and the committing of results.
In the function updateMetric of UpdateScopusMetrics.java, there is a while block starting at line 88 where the final action is to commit the obtained metrics of the item.
This seems to alter the item iterator, so that fewer items are processed than should be updated.
DSpace/dspace-api/src/main/java/org/dspace/metrics/scopus/UpdateScopusMetrics.java
Lines 81 to 122 in 68eeb03
    @Override
    public long updateMetric(Context context, Iterator<Item> itemIterator, String param) {
        long updatedItems = 0;
        long foundItems = 0;
        long apiCalls = 0;
        logsCache = new ArrayList<>();
        try {
            while (itemIterator.hasNext()) {
                Map<String, Item> queryMap = new HashMap<>();
                List<Item> itemList = new ArrayList<>();
                for (int i = 0; i < fetchSize && itemIterator.hasNext(); i++) {
                    Item item = itemIterator.next();
                    logAndCache("Adding item with uuid: " + item.getID());
                    setLastImportMetadataValue(context, item);
                    itemList.add(item);
                }
                foundItems += itemList.size();
                String id = this.generateQuery(queryMap, itemList);
                logAndCache("Getting scopus metrics for " + id);
                updatedItems +=
                    scopusProvider.getScopusList(this.generateQuery(queryMap, itemList))
                        .stream()
                        .filter(Objects::nonNull)
                        .map(scopusMetric -> this.updateScopusMetrics(
                                context,
                                this.findItem(queryMap, scopusMetric),
                                scopusMetric
                            )
                        )
                        .filter(BooleanUtils::isTrue)
                        .count();
                apiCalls++;
                context.commit();
            }
        } catch (SQLException e) {
            logAndCacheError("Error while updating scopus' metrics", e);
        } finally {
            logAndCache("Found and fetched " + foundItems + " with " + apiCalls + " api calls!");
        }
        logsCache.addAll(scopusProvider.getLogs());
        return updatedItems;
    }
I moved the commit call outside the while block on my instance and that seems to fix the problem.
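
To illustrate one way a commit inside the loop could shrink the set of processed items, here is a minimal, self-contained sketch. It does not use any DSpace classes. FakeItem, FakeStore and PagedIterator are hypothetical stand-ins, and the page size of 20 is an arbitrary choice. It assumes that the iterator lazily re-runs an offset-based query and that committing a processed item removes it from that query's results. I have not verified that the real item iterator behaves this way.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class CommitInsideLoopDemo {

    /** Hypothetical item: 'pending' is the committed state the query filters on. */
    static final class FakeItem {
        boolean pending = true;  // still matches the selection query
        boolean dirty = false;   // changed in this "session" but not yet committed
    }

    /** Hypothetical store whose query returns one page of still-pending items. */
    static final class FakeStore {
        final List<FakeItem> items = new ArrayList<>();

        FakeStore(int n) {
            for (int i = 0; i < n; i++) {
                items.add(new FakeItem());
            }
        }

        List<FakeItem> page(int offset, int size) {
            List<FakeItem> matching = new ArrayList<>();
            for (FakeItem item : items) {
                if (item.pending) {
                    matching.add(item);
                }
            }
            int from = Math.min(offset, matching.size());
            int to = Math.min(offset + size, matching.size());
            return new ArrayList<>(matching.subList(from, to));
        }

        /** Makes uncommitted changes visible to later queries. */
        void commit() {
            for (FakeItem item : items) {
                if (item.dirty) {
                    item.pending = false;
                    item.dirty = false;
                }
            }
        }
    }

    /** Assumed iterator behaviour: fetch the next page with a growing offset. */
    static final class PagedIterator implements Iterator<FakeItem> {
        private final FakeStore store;
        private final int pageSize;
        private List<FakeItem> buffer = new ArrayList<>();
        private int pos = 0;
        private int offset = 0;

        PagedIterator(FakeStore store, int pageSize) {
            this.store = store;
            this.pageSize = pageSize;
        }

        @Override
        public boolean hasNext() {
            if (pos >= buffer.size()) {
                buffer = store.page(offset, pageSize);
                pos = 0;
                offset += pageSize;
            }
            return pos < buffer.size();
        }

        @Override
        public FakeItem next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return buffer.get(pos++);
        }
    }

    /** Mirrors the batch loop shape of updateMetric; returns how many items were visited. */
    static int process(int total, int pageSize, int fetchSize, boolean commitInsideLoop) {
        FakeStore store = new FakeStore(total);
        Iterator<FakeItem> it = new PagedIterator(store, pageSize);
        int processed = 0;
        while (it.hasNext()) {
            for (int i = 0; i < fetchSize && it.hasNext(); i++) {
                it.next().dirty = true;
                processed++;
            }
            if (commitInsideLoop) {
                store.commit();  // shrinks the result set while the offset keeps growing
            }
        }
        store.commit();          // single commit after iteration has finished
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("commit inside loop: " + process(866, 20, 20, true));  // 440
        System.out.println("commit after loop:  " + process(866, 20, 20, false)); // 866
    }
}
```

Under these assumptions every commit removes a page of items from the result set while the offset still advances, so every second page is skipped and only about half of the items are visited. That is the same order of magnitude as the 442 of 866 reported above, but it does not prove that the same mechanism is at work in UpdateScopusMetrics.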