The scenario is:
when a new file is created in a folder, we want to do something with it.
We can use watchdog for this. Example code is below:
import logging
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

from app.tasks import push_file_to_volc

# Folder to watch for new files.
path = "/data4/products"


class MyHandler(FileSystemEventHandler):
    """Handle files that have finished being written."""

    def on_closed(self, event):
        # Only react to files named main-*.tar.
        filename = event.src_path.split("/")[-1]
        if filename.startswith("main-") and filename.endswith(".tar"):
            push_file_to_volc.delay(event.src_path)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')
    event_handler = MyHandler()
    observer = Observer()
    # Watch only the top-level folder, not its subfolders.
    observer.schedule(event_handler, path, recursive=False)
    observer.start()
    try:
        # Keep the main thread alive while the observer thread runs.
        while True:
            time.sleep(1)
    finally:
        observer.stop()
        observer.join()
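The filename check inside the handler can also be pulled out into a small function, which makes it easy to test without running watchdog at all. This is only a sketch: `is_main_tar` is a hypothetical name I made up for illustration, and it uses `os.path.basename` instead of splitting on "/".

```python
import os


def is_main_tar(src_path):
    """Return True if the file name looks like main-*.tar."""
    # basename drops the directory part, so only the file name is checked.
    filename = os.path.basename(src_path)
    return filename.startswith("main-") and filename.endswith(".tar")
```

The handler could then call `is_main_tar(event.src_path)` before queuing the task.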
One important thing to note is that I overrode the on_closed method, not the on_created method.
The on_created method is triggered when the file is first created, but that doesn't guarantee the file has been completely written.
The on_closed method, on the other hand, is triggered when the file is closed, which indicates that the writer has finished with it.
If we rely on on_created and the new file is large, we risk picking up an incomplete file when we try to copy it to another location.
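As far as I know, watchdog only emits the closed event on Linux, where it is backed by inotify. If on_closed is not available, one possible fallback (not what the code above does) is to wait until the file size stops changing before touching the file. Below is a minimal sketch of that idea. The name `wait_until_stable` and its parameters are my own assumptions, not part of watchdog, and a stable size is only a heuristic, since a slow writer can pause for longer than the polling window.

```python
import os
import time


def wait_until_stable(path, interval=1.0, checks=3, timeout=60.0):
    """Return True once the size of `path` is unchanged for `checks` polls in a row.

    Returns False if that does not happen before `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # The file is missing or not readable yet; treat it as unstable.
            size = -1
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            # Size changed (or file unavailable), so start counting again.
            stable = 0
        last_size = size
        time.sleep(interval)
    return False
```

An on_created handler could call `wait_until_stable(event.src_path)` and only queue the task when it returns True. Because the call blocks, it is probably better to run it inside the background task than in the handler itself.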