Clearing source does not reset cursor (Box 2.0.0-4)
Calling clear() on a Box source doesn't reset the cursor used in harvests. In the log dump below, the value for cursor continues to increment after Box's internal database has been reset.
Code:
void resetDatabase() {
  logger.info("--CLEARING BOX DATABASE--");
  box.getSource(sourceName).clear();
  logger.info("--FINISHED CLEARING DATABASE--");
}
Log dump:
...
2019-03-26 16:47:36.275 INFO - harvesting item with cursor {"cursor":"1520335629751162881"}
2019-03-26 16:47:36.287 INFO - harvesting item with cursor {"cursor":"1520335629941666169"}
2019-03-26 16:47:36.305 INFO - harvesting item with cursor {"cursor":"1520335630119227230"}
2019-03-26 16:47:36.316 INFO - --CLEARING BOX DATABASE--
2019-03-26 16:47:36.382 INFO - harvesting item with cursor {"cursor":"1520335630282232156"}
2019-03-26 16:47:36.438 INFO - harvesting item with cursor {"cursor":"1520335630474786224"}
2019-03-26 16:47:36.475 INFO - harvesting item with cursor {"cursor":"1520335630690383804"}
2019-03-26 16:47:36.488 INFO - harvesting item with cursor {"cursor":"1520335630863778258"}
2019-03-26 16:47:36.517 INFO - --FINISHED CLEARING DATABASE--
2019-03-26 16:47:36.519 INFO - harvesting item with cursor {"cursor":"1520335631130648833"}
2019-03-26 16:47:36.535 INFO - harvesting item with cursor {"cursor":"1520335631498461471"}
2019-03-26 16:47:36.554 INFO - harvesting item with cursor {"cursor":"1520335631705357265"}
...
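To double-check that reading of the log, here is a small standalone sketch that pulls the cursor values out of a few of the lines above and confirms they never drop back after the clear. The class and method names (CursorLogCheck, extractCursors, strictlyIncreasing) are made up for this illustration and are not part of Box. It also assumes the cursors are plain decimal numbers that fit in a long, which holds for the values in this dump.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for this issue only; not part of Box.
public class CursorLogCheck {

  // Matches the JSON fragment logged by the harvest, e.g. {"cursor":"1520335629751162881"}
  private static final Pattern CURSOR = Pattern.compile("\\{\"cursor\":\"(\\d+)\"\\}");

  // Returns the cursor values in the order they appear; lines without a cursor are skipped.
  public static List<Long> extractCursors(List<String> logLines) {
    List<Long> cursors = new ArrayList<>();
    for (String line : logLines) {
      Matcher m = CURSOR.matcher(line);
      if (m.find()) {
        // Assumption: the cursor is a decimal number that fits in a long.
        cursors.add(Long.parseLong(m.group(1)));
      }
    }
    return cursors;
  }

  // True if every value is larger than the one before it, i.e. the cursor was never reset.
  public static boolean strictlyIncreasing(List<Long> values) {
    for (int i = 1; i < values.size(); i++) {
      if (values.get(i) <= values.get(i - 1)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // A few lines copied from the log dump, spanning the clear() call.
    List<String> lines = List.of(
        "2019-03-26 16:47:36.305 INFO - harvesting item with cursor {\"cursor\":\"1520335630119227230\"}",
        "2019-03-26 16:47:36.316 INFO - --CLEARING BOX DATABASE--",
        "2019-03-26 16:47:36.382 INFO - harvesting item with cursor {\"cursor\":\"1520335630282232156\"}",
        "2019-03-26 16:47:36.517 INFO - --FINISHED CLEARING DATABASE--",
        "2019-03-26 16:47:36.519 INFO - harvesting item with cursor {\"cursor\":\"1520335631130648833\"}");

    boolean neverReset = strictlyIncreasing(extractCursors(lines));
    System.out.println("cursor kept increasing across clear(): " + neverReset);
  }
}
```

If clear() had reset the cursor, I would expect a value after the FINISHED CLEARING line to be lower than the one before it, and the check would report false.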
My harvest is using a custom View object to load the documents into Solr, and it doesn't store any documents in Box.
import java.util.Collections;
import java.util.List;

public class ItemViewHarvester extends View {
  private SolrDao solr;

  public ItemViewHarvester setSolrDao(SolrDao solrDao) {
    this.solr = solrDao;
    return this;
  }

  @Override
  protected List<BoxDocument> transform(List<BoxDocument> documents) {
    // upload documents to Solr
    solr.addBoxDocuments(documents);
    // don't add documents to Box's internal database
    return Collections.emptyList();
  }
}
And my box configuration:
box:
  sources:
    item:
      harvest:
        type: ItemViewHarvester
        params:
          uri: "sourceURL"
      db:
        type: edu.byu.hbll.box.impl.MongoDatabase
        params:
          database: myapp
Edited by Josh Cooper