query_builder

Notes (translated from the Chinese documentation)

MongoDB queries related to the Scheduler.

crawlib.pipeline.mongodb.query_builder.finished(finished_status, update_interval, status_key, edit_at_key)

Create a query dict for pymongo that matches all finished tasks.

Parameters:
  • finished_status – int, status codes greater than or equal to this value are considered finished.
  • update_interval – int, update interval in seconds; a record is expected to be refreshed once every update_interval seconds.
  • status_key – str, key of the status code field; dot notation is supported.
  • edit_at_key – str, key of the edit_at time field; dot notation is supported.
Returns:

dict, a pymongo filter.

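Notes (translated from the Chinese documentation)

The status code is greater than or equal to a given value, and the record was last updated within the most recent interval.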
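
The condition above can be sketched as a filter dict. This is a minimal illustration, not crawlib's actual source: the function name finished_filter_sketch, the optional now argument, and the choice of the $gte operator for both fields are assumptions inferred from the parameter descriptions.

```python
from datetime import datetime, timedelta, timezone


def finished_filter_sketch(finished_status, update_interval,
                           status_key, edit_at_key, now=None):
    """Hypothetical re-creation of the documented behavior (not crawlib code).

    Matches documents whose status code is at least ``finished_status``
    AND whose edit time is no older than ``update_interval`` seconds.
    """
    # ``now`` is an assumed extra argument so the cutoff is reproducible.
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=update_interval)
    # Two keys in one dict are combined with an implicit AND by MongoDB.
    return {
        status_key: {"$gte": finished_status},
        edit_at_key: {"$gte": cutoff},
    }


# Example with a fixed clock and a dot-notation key.
example_filter = finished_filter_sketch(
    finished_status=50,
    update_interval=3600,
    status_key="meta.status",
    edit_at_key="meta.edit_at",
    now=datetime(2024, 1, 1, 12, 0, 0),
)
```

A dict of this shape could be passed to pymongo's Collection.find() as the filter argument.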

crawlib.pipeline.mongodb.query_builder.unfinished(finished_status, update_interval, status_key, edit_at_key)

Create a query dict for pymongo that matches all unfinished tasks.

Parameters:
  • finished_status – int, status codes less than this value are considered unfinished.
  • update_interval – int, update interval in seconds; a record is expected to be refreshed once every update_interval seconds.
  • status_key – str, key of the status code field; dot notation is supported.
  • edit_at_key – str, key of the edit_at time field; dot notation is supported.
Returns:

dict, a pymongo filter.

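Notes (translated from the Chinese documentation)

The status code is less than a given value, or the time elapsed since the last update exceeds a given threshold.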
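
The "or" condition above can be sketched with MongoDB's $or operator. This is a minimal illustration, not crawlib's actual source: the function name unfinished_filter_sketch, the optional now argument, and the use of $lt for both branches are assumptions inferred from the parameter descriptions.

```python
from datetime import datetime, timedelta, timezone


def unfinished_filter_sketch(finished_status, update_interval,
                             status_key, edit_at_key, now=None):
    """Hypothetical re-creation of the documented behavior (not crawlib code).

    Matches documents whose status code is below ``finished_status``
    OR whose edit time is older than ``update_interval`` seconds.
    """
    # ``now`` is an assumed extra argument so the cutoff is reproducible.
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=update_interval)
    # $or takes a list of sub-filters; a document matches if any one does.
    return {
        "$or": [
            {status_key: {"$lt": finished_status}},
            {edit_at_key: {"$lt": cutoff}},
        ]
    }


# Example with a fixed clock.
example_filter = unfinished_filter_sketch(
    finished_status=50,
    update_interval=3600,
    status_key="status",
    edit_at_key="edit_at",
    now=datetime(2024, 1, 1, 12, 0, 0),
)
```

Under these assumptions, the filter is the logical negation of a "status at least finished_status and edited within the interval" condition for documents that contain both fields. Documents missing either field may behave differently, since comparison operators such as $lt do not match absent fields.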