notes with tags: dagops

id text url tags updated_at
4862 dagops: restart logic. Use asset-based pipelines (as in Dagster or a Makefile) instead of task-run-based ones. Probably add an is_done_command/exists_command argument to TaskInfo, which is also run via subprocess. Also add statuses CACHE_CHECKING, CACHE_EXISTS, CACHE_NOT_EXISTS, CACHE_CHECK_FAILED. Differentiate between asset and task: a task should yield assets. dagops 2023 Aug 12 17:58
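A minimal sketch of the cache check from note 4862, assuming exists_command is a list of argv strings and assuming an exit-code convention the note does not specify (0 = asset exists, 1 = asset missing, anything else = the check itself failed). `CacheStatus` and `check_cache` are hypothetical names, not existing dagops API.

```python
import enum
import subprocess
import sys


class CacheStatus(enum.Enum):
    CACHE_CHECKING = "CACHE_CHECKING"
    CACHE_EXISTS = "CACHE_EXISTS"
    CACHE_NOT_EXISTS = "CACHE_NOT_EXISTS"
    CACHE_CHECK_FAILED = "CACHE_CHECK_FAILED"


def check_cache(exists_command: list[str], timeout: float = 30.0) -> CacheStatus:
    """Run exists_command and map its outcome to a cache status.

    Assumed convention (not from the notes): exit code 0 means the asset
    exists, 1 means it is missing, and any other code, a timeout, or a
    failure to start the process means the check itself failed.
    """
    try:
        result = subprocess.run(exists_command, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return CacheStatus.CACHE_CHECK_FAILED
    if result.returncode == 0:
        return CacheStatus.CACHE_EXISTS
    if result.returncode == 1:
        return CacheStatus.CACHE_NOT_EXISTS
    return CacheStatus.CACHE_CHECK_FAILED


if __name__ == "__main__":
    # Stand-in command: a Python one-liner that exits with code 0.
    print(check_cache([sys.executable, "-c", "raise SystemExit(0)"]))
```

The caller would set CACHE_CHECKING before invoking the check and replace it with the returned status afterwards.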
4838 fix negative duration of dagops tasks libmv dagops 2023 Aug 12 12:38
4622 update cancel_orphans method with new states dagops issues 2023 Jul 16 09:20
4621 Add exists cache - [ ] test_lock - [ ] test path unlocked when task failed or succeeded dagops issues 2023 Jul 16 09:20
4620 simplify create_dag, prepare_dag, validate_dag methods (move all logic to pydantic) dagops issues 2023 Jul 16 09:20
4619 use async sqlalchemy 2.0 methods dagops issues 2023 Jul 16 09:20
4618 Cache check should be on the daemon side. The worker should only run processes and have queued, running, success/failed states dagops issues 2023 Jul 16 09:19
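A rough sketch of the reduced worker state set from note 4618 as an explicit transition table. `WorkerState`, `TRANSITIONS`, and `WorkerTask` are hypothetical names, and the allowed transitions are an assumption (success and failed treated as terminal).

```python
import enum


class WorkerState(enum.Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


# Assumed transitions: SUCCESS and FAILED are terminal states.
TRANSITIONS = {
    WorkerState.QUEUED: {WorkerState.RUNNING},
    WorkerState.RUNNING: {WorkerState.SUCCESS, WorkerState.FAILED},
    WorkerState.SUCCESS: set(),
    WorkerState.FAILED: set(),
}


class WorkerTask:
    def __init__(self) -> None:
        self.status = WorkerState.QUEUED

    def transition(self, new_status: WorkerState) -> None:
        """Move to new_status, rejecting anything not in the table."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"illegal transition {self.status.name} -> {new_status.name}"
            )
        self.status = new_status
```

Keeping cache-related states out of this table is the point: the worker never sees them.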
4617 reduce number of maxtasks (32 -> 8) in CI dagops issues 2023 Jul 16 09:19
4616 remove `id=uuid.uuid4(),` from `models.Task(` calls (use a default factory) dagops issues 2023 Jul 16 09:19
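A sketch of the default-factory idea from note 4616. The real base class of models.Task is not stated in these notes, so a stdlib dataclass stands in here; the `command` field is made up for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Task:
    command: str  # hypothetical field, only here to have something to pass
    # Generated per instance, so call sites no longer pass id=uuid.uuid4().
    id: uuid.UUID = field(default_factory=uuid.uuid4)


first = Task(command="echo 1")
second = Task(command="echo 2")
print(first.id != second.id)
```

If the model is a pydantic or SQLAlchemy class, the equivalent would presumably be `Field(default_factory=uuid.uuid4)` or a column `default=uuid.uuid4`.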
4615 Don't create Daemon objects or run prepare_workers in main.py files. Use a declarative approach instead. Automatically cancel_orphans, run prepare_workers, etc. Candidate names: DaemonSet.run() / DaemonGroup.run() / DaemonRunner dagops issues 2023 Jul 16 09:18
4614 Don't create a new task for the cache check in the daemon: the daemon task stays the same, it just changes its state and sends a signal to the worker. The worker creates a new process and its own new FSM task, but there is no need to renew the daemon FSM dagops issues 2023 Jul 16 09:18
4613 Don't store task state, or remove it after restart (idempotency). It should work like a Makefile: `exists` checks after each run. One option is to recheck all files in watch_directory in a `while True` loop and check that everything exists (including dependencies), i.e. the whole DAG. But we could also keep storing state as we do now and allow restarts; maybe status = DONE should be stored in redis. Check state each time a task has a dependency. Maybe store the asset path (redis/filesystem) in the task table (or 1 task can yield multiple assets). Yeah, a task can yield multiple assets (e.g. gpu batch pitch task). But a task always depends on a fixed list of asset paths: - if the asset paths exist, the task is not executed - if a task depends on a phony task/target/asset, it will always be executed (can't be cached) dagops issues 2023 Jul 16 09:17
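A minimal sketch of the Makefile-like rule from note 4613, assuming assets are filesystem paths and that "phony" is checked only on direct upstream tasks. `TaskSpec` and `should_run` are hypothetical names; treating a task with no declared assets as always runnable is also an assumption.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TaskSpec:
    name: str
    yields: list[Path] = field(default_factory=list)  # assets the task produces
    upstream: list["TaskSpec"] = field(default_factory=list)
    phony: bool = False  # phony tasks can never be cached


def should_run(task: TaskSpec) -> bool:
    """Return True if the task must be executed, False if it can be skipped."""
    # A phony task, or a task depending on a phony one, always runs.
    if task.phony or any(dep.phony for dep in task.upstream):
        return True
    # Assumption: with no declared assets there is nothing to check, so run.
    if not task.yields:
        return True
    # Skip only when every yielded asset already exists.
    return not all(path.exists() for path in task.yields)
```

Because the answer is recomputed from the filesystem each time, no task state has to survive a restart.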
4153 Design overview - redun https://insitro.github.io/redun/design.html# python dagops 2023 Mar 12 18:00
4152 Launch HN: DAGWorks – ML platform for data science teams | Hacker News https://news.ycombinator.com/item?id=35056903 python dagops 2023 Mar 12 17:57
4151 pditommaso/awesome-pipeline: A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin https://github.com/pditommaso/awesome-pipeline python dagops 2023 Mar 12 17:57
4109 add mypy and pyright python dagops 2023 Mar 06 20:34
4066 # state machines - [pytransitions/transitions: A lightweight, object-oriented finite state machine implementation in Python with many extensions](https://github.com/pytransitions/transitions) - [transitions critique: | Hacker News](https://news.ycombinator.com/item?id=14636947) - [When Booleans Are Not Enough... State Machines? - YouTube](https://www.youtube.com/watch?v=I1Mzx_tSpew) - basically when your state is more than 2 booleans - there's a good chance state machine will make things simpler - [Designing state machines | Hacker News](https://news.ycombinator.com/item?id=14634947) - [Pycon Ireland 2018: Finite State Machines in Python - Brian Stempin - YouTube](https://www.youtube.com/watch?v=1WIrc6b6Avc) - [Glyph Lefkowitz - Automat - Pyninsula #0 - YouTube](https://www.youtube.com/watch?v=0wOZBpD1VVk) - [State Machines - YouTube](https://www.youtube.com/watch?v=TedUKXhu9kE) - [glyph/automat: Self-service finite-state machines for the programmer on the go.](https://github.com/glyph/Automat) - [The Python Podcast.__init__: Automat State Machines with Glyph Lefkowitz](https://www.pythonpodcast.com/automat-state-machines-with-glyph-lefkowitz-episode-116/) - [Statecharts.pdf](https://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.pdf) coding algorithms dagops python 2023 Mar 06 17:26
4104 automatically create workers when new workers are specified in dags, e.g. my_worker_42 dagops 2023 Mar 05 16:16
4103 try to communicate with worker via watching db tasks table (while True on both sides) instead of redis messages (single storage for state) dagops 2023 Mar 05 13:28
4102 use DI for redis in a way that allows using FakeRedis in tests dagops 2023 Mar 05 10:13
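A sketch of what the DI in note 4102 could look like: the store receives its client instead of constructing it. `KeyValueClient`, `InMemoryClient`, `TaskStatusStore`, and the `status:<id>` key layout are assumptions for illustration. In production the injected client would presumably be `redis.Redis(decode_responses=True)`, in tests `fakeredis.FakeRedis` or the tiny double below.

```python
from typing import Optional, Protocol


class KeyValueClient(Protocol):
    """The subset of the redis client interface this sketch relies on."""

    def get(self, name: str) -> Optional[str]: ...
    def set(self, name: str, value: str) -> object: ...


class InMemoryClient:
    """Dict-backed test double exposing only get/set."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, name: str) -> Optional[str]:
        return self._data.get(name)

    def set(self, name: str, value: str) -> bool:
        self._data[name] = value
        return True


class TaskStatusStore:
    def __init__(self, client: KeyValueClient) -> None:
        self._client = client  # injected, never constructed here

    def set_status(self, task_id: str, status: str) -> None:
        self._client.set(f"status:{task_id}", status)

    def get_status(self, task_id: str) -> Optional[str]:
        return self._client.get(f"status:{task_id}")
```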
4101 replace `WorkerTaskStatus` states back to `TaskStatus` with prefix `WORKER_` dagops 2023 Mar 04 09:45
4077 try SQLModel https://sqlmodel.tiangolo.com/features/ python dagops 2023 Feb 26 19:07
3936 The DAG can be implemented via routing to different queues, though pub/sub is probably simpler https://youtu.be/IwW4yOn_2yc?t=1558 dagops 2023 Feb 03 16:30
3927 use json structured log https://youtu.be/YA42xVgfrFE?t=611 dagops 2023 Feb 03 12:38
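A small stdlib-only sketch of JSON structured logging for note 3927. `JsonFormatter`, `make_logger`, the field names, and the optional `task_id` extra are all assumptions; the linked talk may recommend a dedicated library instead.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via extra={"task_id": ...} become record attributes.
        task_id = getattr(record, "task_id", None)
        if task_id is not None:
            payload["task_id"] = task_id
        return json.dumps(payload)


def make_logger(name: str = "dagops") -> logging.Logger:
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    make_logger().info("task finished", extra={"task_id": "cc8fddb6"})
```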
3915 write `isdone` / `restart` logic. Don't start a task if its resource exists dagops 2023 Feb 02 21:52
3866 try using redis as a flat db, like ``` status:cc8fddb622b84fd79d8dae3f7f4fba31 - SUCCESS upstream:cc8fddb622b84fd79d8dae3f7f4fba31 - [622b84, 79d8da, f4fba31] ``` dagops 2023 Jan 22 09:24
3849 generate graph image using graphviz dagops 2023 Jan 15 14:13
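One dependency-free way to approach note 3849 is to emit Graphviz DOT source and let the `dot` CLI render it. `to_dot` and the task-to-upstream mapping shape are assumptions, and task names are assumed to contain no double quotes.

```python
def to_dot(upstream: dict[str, list[str]]) -> str:
    """Render a task -> upstream-tasks mapping as Graphviz DOT source."""
    lines = ["digraph dag {"]
    # Declare every task first so isolated nodes still appear.
    for task in sorted(upstream):
        lines.append(f'    "{task}";')
    # Edges point from a dependency to the task that needs it.
    for task in sorted(upstream):
        for dep in sorted(upstream[task]):
            lines.append(f'    "{dep}" -> "{task}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot({"train": ["extract", "clean"], "clean": ["extract"], "extract": []}))
```

Saving the output as dag.dot and running `dot -Tpng dag.dot -o dag.png` should produce the image, provided Graphviz is installed.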