notes with tags:
dagops
id |
text |
url |
tags |
updated_at |
4862 |
dagops: add restart logic. Pipelines should be asset-based (as in dagster or a Makefile) rather than based on task runs.
Probably add an `is_done_command`/`exists_command` argument to TaskInfo, which is also run using subprocess. Also add the statuses CACHE_CHECKING, CACHE_EXISTS, CACHE_NOT_EXISTS, CACHE_CHECK_FAILED.
Differentiate between asset and task: a task should yield assets. |
|
dagops
|
2023 Aug 12 17:58 |
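A minimal sketch of the cache check from note 4862, assuming the convention that `exists_command` exits with code 0 when the asset already exists (like `test -e`). The `TaskInfo` fields, the `CacheStatus` enum and `check_cache` are hypothetical names for illustration; only the four status names come from the note.

```python
import subprocess
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class CacheStatus(str, Enum):
    CACHE_CHECKING = "CACHE_CHECKING"  # would be set by the caller before the check
    CACHE_EXISTS = "CACHE_EXISTS"
    CACHE_NOT_EXISTS = "CACHE_NOT_EXISTS"
    CACHE_CHECK_FAILED = "CACHE_CHECK_FAILED"


@dataclass
class TaskInfo:
    # Hypothetical shape; the real TaskInfo in dagops may differ.
    command: Sequence[str]
    exists_command: Optional[Sequence[str]] = None


def check_cache(task: TaskInfo, timeout: float = 10.0) -> CacheStatus:
    """Run exists_command; exit code 0 is assumed to mean 'asset exists'."""
    if task.exists_command is None:
        # No check configured: treat the task as not cached.
        return CacheStatus.CACHE_NOT_EXISTS
    try:
        result = subprocess.run(task.exists_command, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # The check itself could not be started or did not finish in time.
        return CacheStatus.CACHE_CHECK_FAILED
    if result.returncode == 0:
        return CacheStatus.CACHE_EXISTS
    return CacheStatus.CACHE_NOT_EXISTS
```

With this convention, a task such as `TaskInfo(command=["make", "out.bin"], exists_command=["test", "-e", "out.bin"])` would be skipped whenever `out.bin` is already present.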
4838 |
fix negative duration of dagops tasks
|
|
libmv
dagops
|
2023 Aug 12 12:38 |
4622 |
update cancel_orphans method with new states
|
|
dagops
issues
|
2023 Jul 16 09:20 |
4621 |
Add exists cache
- [ ] test_lock
- [ ] test path unlocked when task failed or succeeded |
|
dagops
issues
|
2023 Jul 16 09:20 |
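One possible shape for the two tests listed in note 4621. `PathLocks` is a hypothetical in-memory stand-in for the real lock store, which these notes do not describe. The point of the sketch is the `try`/`finally`, which releases the path whether the task succeeds or fails.

```python
from contextlib import contextmanager
from typing import Iterator, Set


class PathLocks:
    """In-memory stand-in for the real lock store (an assumption)."""

    def __init__(self) -> None:
        self._locked: Set[str] = set()

    def is_locked(self, path: str) -> bool:
        return path in self._locked

    @contextmanager
    def lock(self, path: str) -> Iterator[None]:
        if path in self._locked:
            raise RuntimeError(f"{path} is already locked")
        self._locked.add(path)
        try:
            yield
        finally:
            # Runs on success and on failure, so the path is always released.
            self._locked.discard(path)


def test_lock() -> None:
    locks = PathLocks()
    with locks.lock("/data/a"):
        assert locks.is_locked("/data/a")
        try:
            with locks.lock("/data/a"):
                raise AssertionError("second lock should have been refused")
        except RuntimeError:
            pass
        # The refused attempt must not release the original lock.
        assert locks.is_locked("/data/a")


def test_path_unlocked_when_task_failed_or_succeeded() -> None:
    locks = PathLocks()
    with locks.lock("/data/a"):
        pass  # task succeeded
    assert not locks.is_locked("/data/a")
    try:
        with locks.lock("/data/a"):
            raise ValueError("task failed")
    except ValueError:
        pass
    assert not locks.is_locked("/data/a")
```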
4620 |
simplify create_dag, prepare_dag, validate_dag methods (move all logic to pydantic) |
|
dagops
issues
|
2023 Jul 16 09:20 |
4619 |
use async sqlalchemy 2.0 methods
|
|
dagops
issues
|
2023 Jul 16 09:20 |
4618 |
The cache check should be on the daemon side. The worker should only run processes and have queued, running, and success/failed states |
|
dagops
issues
|
2023 Jul 16 09:19 |
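The reduced worker state set from note 4618 could be written as a small transition table. This is a sketch only: the enum name `WorkerTaskStatus` is borrowed from another note in this list, and the `ALLOWED` table and `transition` helper are hypothetical.

```python
from enum import Enum
from typing import Dict, Set


class WorkerTaskStatus(str, Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


# Which states may follow which. SUCCESS and FAILED are terminal.
ALLOWED: Dict[WorkerTaskStatus, Set[WorkerTaskStatus]] = {
    WorkerTaskStatus.QUEUED: {WorkerTaskStatus.RUNNING},
    WorkerTaskStatus.RUNNING: {WorkerTaskStatus.SUCCESS, WorkerTaskStatus.FAILED},
    WorkerTaskStatus.SUCCESS: set(),
    WorkerTaskStatus.FAILED: set(),
}


def transition(current: WorkerTaskStatus, new: WorkerTaskStatus) -> WorkerTaskStatus:
    """Return the new state, or raise if the move is not in the table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```

Cache-related states are deliberately absent here; under the note's proposal they would live in the daemon's state machine.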
4617 |
reduce number of maxtasks (32 -> 8) in CI |
|
dagops
issues
|
2023 Jul 16 09:19 |
4616 |
delete `id=uuid.uuid4(),` from `models.Task(` calls (use a default factory)
|
|
dagops
issues
|
2023 Jul 16 09:19 |
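For note 4616, the idea can be illustrated with a plain dataclass. The real `models.Task` is probably an ORM or pydantic model, so this `Task` class is only a stand-in; those libraries have their own ways to declare a default factory.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Task:
    # Stand-in for models.Task; only the id default matters here.
    command: str
    # The factory is called once per instance, so callers no longer
    # need to pass id=uuid.uuid4() explicitly.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


first = Task(command="echo 1")
second = Task(command="echo 2")
```

Each instance gets its own id, while `id=uuid.uuid4()` written as a plain class-level default would be evaluated only once and shared.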
4615 |
Don't create Daemon objects and run prepare_workers in main.py files. Use some declarative approach instead.
Automatically cancel_orphans, run prepare_workers, etc.
DaemonSet.run() / DaemonGroup.run()
DaemonRunner |
|
dagops
issues
|
2023 Jul 16 09:18 |
4614 |
Don't create a new task for the cache in the daemon
The daemon task stays the same; it just changes its state and sends a signal to the worker. The worker creates a new process and its own new FSM task, but there is no need to renew the daemon FSM |
|
dagops
issues
|
2023 Jul 16 09:18 |
4613 |
Don't store task state, or remove it after restart (idempotency). It should be like a Makefile: `exists` checks after each run.
We could re-check all files in watch_directory in a `while True` loop and verify that everything exists (including dependencies), in short, the whole DAG. But we could also keep saving state as now and allow restarts.
Maybe status = DONE needs to be stored in redis.
Check state each time a task has a dependency.
Maybe store the asset path (redis/filesystem) in the task table (or one task can yield multiple assets).
Yeah, a task can yield multiple assets (e.g. the GPU batch pitch task).
But a task always depends on a fixed list of asset paths:
- if the asset paths exist, the task is not executed
- if a task depends on a phony task/target/asset, it will always be executed (can't be cached)
|
|
dagops
issues
|
2023 Jul 16 09:17 |
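A rough sketch of the Makefile-like rule from note 4613. It assumes that "asset paths" means the files a task yields and that an asset is a local file path. `Task`, its fields and `should_run` are hypothetical names, and timestamp comparison (as `make` does) is left out.

```python
import os
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Task:
    name: str
    outputs: List[str] = field(default_factory=list)  # asset paths the task yields
    phony: bool = False  # phony tasks have no cacheable result


def should_run(task: Task, upstream: Iterable[Task] = ()) -> bool:
    """Decide from the filesystem alone, with no stored task state."""
    # A phony task, or a task depending on one, can never be cached.
    if task.phony or any(dep.phony for dep in upstream):
        return True
    # Without declared outputs there is nothing to check, so run.
    if not task.outputs:
        return True
    # Skip only when every asset the task yields already exists.
    return not all(os.path.exists(path) for path in task.outputs)
```

Because the decision is recomputed from the assets on every pass, restarting the daemon needs no cleanup of saved state.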
4153 |
Design overview - redun |
https://insitro.github.io/redun/design.html# |
python
dagops
|
2023 Mar 12 18:00 |
4152 |
Launch HN: DAGWorks – ML platform for data science teams | Hacker News |
https://news.ycombinator.com/item?id=35056903 |
python
dagops
|
2023 Mar 12 17:57 |
4151 |
pditommaso/awesome-pipeline: A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin |
https://github.com/pditommaso/awesome-pipeline |
python
dagops
|
2023 Mar 12 17:57 |
4109 |
add mypy and pyright |
|
python
dagops
|
2023 Mar 06 20:34 |
4066 |
# state machines
- [pytransitions/transitions: A lightweight, object-oriented finite state machine implementation in Python with many extensions](https://github.com/pytransitions/transitions)
- [transitions critique: | Hacker News](https://news.ycombinator.com/item?id=14636947)
- [When Booleans Are Not Enough... State Machines? - YouTube](https://www.youtube.com/watch?v=I1Mzx_tSpew)
- basically when your state is more than 2 booleans - there's a good chance state machine will make things simpler
- [Designing state machines | Hacker News](https://news.ycombinator.com/item?id=14634947)
- [Pycon Ireland 2018: Finite State Machines in Python - Brian Stempin - YouTube](https://www.youtube.com/watch?v=1WIrc6b6Avc)
- [Glyph Lefkowitz - Automat - Pyninsula #0 - YouTube](https://www.youtube.com/watch?v=0wOZBpD1VVk)
- [State Machines - YouTube](https://www.youtube.com/watch?v=TedUKXhu9kE)
- [glyph/automat: Self-service finite-state machines for the programmer on the go.](https://github.com/glyph/Automat)
- [The Python Podcast.__init__: Automat State Machines with Glyph Lefkowitz](https://www.pythonpodcast.com/automat-state-machines-with-glyph-lefkowitz-episode-116/)
- [Statecharts.pdf](https://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.pdf) |
|
coding
algorithms
dagops
python
|
2023 Mar 06 17:26 |
4104 |
automatically create workers when new workers are specified in dags, e.g. my_worker_42 |
|
dagops
|
2023 Mar 05 16:16 |
4103 |
try to communicate with worker via watching db tasks table (while True on both sides)
instead of redis messages
(single storage for state) |
|
dagops
|
2023 Mar 05 13:28 |
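The polling idea in note 4103 might look roughly like this on the worker side. The sketch uses the standard-library `sqlite3` module as a stand-in for the real database; the `tasks` table layout, the status strings and the function names are assumptions. `max_polls` exists only so the loop can be bounded in tests.

```python
import sqlite3
import time
from typing import Callable, Optional


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )


def claim_next_task(conn: sqlite3.Connection) -> Optional[str]:
    """Move one QUEUED task to RUNNING and return its id, if any."""
    row = conn.execute(
        "SELECT id FROM tasks WHERE status = 'QUEUED' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The status guard in WHERE means only one poller can win the claim.
    cur = conn.execute(
        "UPDATE tasks SET status = 'RUNNING' WHERE id = ? AND status = 'QUEUED'",
        (row[0],),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None


def worker_loop(
    conn: sqlite3.Connection,
    run: Callable[[str], bool],
    poll_interval: float = 1.0,
    max_polls: Optional[int] = None,
) -> None:
    polls = 0
    while max_polls is None or polls < max_polls:
        task_id = claim_next_task(conn)
        if task_id is None:
            time.sleep(poll_interval)  # nothing queued, wait and look again
        else:
            status = "SUCCESS" if run(task_id) else "FAILED"
            conn.execute("UPDATE tasks SET status = ? WHERE id = ?", (status, task_id))
            conn.commit()
        polls += 1
```

The daemon side would run a similar loop that watches for SUCCESS/FAILED rows, so the table remains the single storage for state.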
4102 |
use DI for redis in a way that allows using FakeRedis in tests |
|
dagops
|
2023 Mar 05 10:13 |
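A sketch of what note 4102 could mean in practice: the code that talks to redis receives the client as a constructor argument, typed against a small protocol. `KeyValueStore`, `InMemoryRedis` and `TaskStateStore` are hypothetical names. `InMemoryRedis` is a tiny hand-written test double used here so the example has no dependencies; a `redis.Redis` client or a FakeRedis instance should be injectable in the same way, since only `get` and `set` are used.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """The small part of the redis client interface this code relies on."""

    def get(self, name: str) -> Optional[bytes]: ...

    def set(self, name: str, value: bytes) -> object: ...


class InMemoryRedis:
    """Minimal test double; not a full redis emulation."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, name: str) -> Optional[bytes]:
        return self._data.get(name)

    def set(self, name: str, value: bytes) -> object:
        self._data[name] = value
        return True


class TaskStateStore:
    def __init__(self, redis: KeyValueStore) -> None:
        # The client is injected, never created here.
        self._redis = redis

    def set_status(self, task_id: str, status: str) -> None:
        self._redis.set(f"status:{task_id}", status.encode())

    def get_status(self, task_id: str) -> Optional[str]:
        raw = self._redis.get(f"status:{task_id}")
        return None if raw is None else raw.decode()
```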
4101 |
replace `WorkerTaskStatus` states back to `TaskStatus` with prefix `WORKER_` |
|
dagops
|
2023 Mar 04 09:45 |
4077 |
try SQLModel |
https://sqlmodel.tiangolo.com/features/ |
python
dagops
|
2023 Feb 26 19:07 |
3936 |
The DAG can be implemented via routing to different queues. Although doing it via pub/sub is probably simpler |
https://youtu.be/IwW4yOn_2yc?t=1558 |
dagops
|
2023 Feb 03 16:30 |
3927 |
use JSON structured logging |
https://youtu.be/YA42xVgfrFE?t=611 |
dagops
|
2023 Feb 03 12:38 |
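For note 3927, a minimal JSON formatter built only on the standard `logging` module could look like this. The field names (`ts`, `level`, `task_id`, `dag_id`) are an assumption, not an established dagops schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in ("task_id", "dag_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("dagops")
    logger.handlers = [handler]
    logger.setLevel(level)
    return logger
```

A call such as `logger.info("task started", extra={"task_id": "cc8f"})` would then emit a single JSON line that includes `task_id`.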
3915 |
write `isdone` / `restart` logic. Don't start a task if its resource exists
|
dagops
|
2023 Feb 02 21:52 |
3866 |
try using redis as a flat db, like
```
status:cc8fddb622b84fd79d8dae3f7f4fba31 - SUCCESS
upstream:cc8fddb622b84fd79d8dae3f7f4fba31 - [622b84, 79d8da, f4fba31]
``` |
|
dagops
|
2023 Jan 22 09:24 |
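The flat key layout in note 3866 can be tried out without a server by letting a plain dict stand in for redis. The `status:` and `upstream:` prefixes come from the note. Storing the upstream list as a JSON string, and the helper names, are assumptions (a redis list or set would be another option).

```python
import json
from typing import Dict, List


def status_key(task_id: str) -> str:
    return f"status:{task_id}"


def upstream_key(task_id: str) -> str:
    return f"upstream:{task_id}"


def save_task(db: Dict[str, str], task_id: str, status: str, upstream: List[str]) -> None:
    """Write both flat keys for one task."""
    db[status_key(task_id)] = status
    db[upstream_key(task_id)] = json.dumps(upstream)


def is_ready(db: Dict[str, str], task_id: str) -> bool:
    """A task is ready when every upstream task has status SUCCESS."""
    upstream = json.loads(db.get(upstream_key(task_id), "[]"))
    return all(db.get(status_key(dep)) == "SUCCESS" for dep in upstream)
```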
3849 |
generate graph image using graphviz |
|
dagops
|
2023 Jan 15 14:13 |
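For note 3849, one dependency-free option is to emit Graphviz DOT text and let the `dot` command-line tool render it. The input shape (a mapping from task name to its upstream task names) and the function name `dag_to_dot` are assumptions. Task names containing double quotes would need escaping, which this sketch skips.

```python
from typing import Dict, List


def dag_to_dot(upstream: Dict[str, List[str]], name: str = "dag") -> str:
    """Build DOT source with one edge per (dependency -> task) pair."""
    lines = [f"digraph {name} {{", "    rankdir=LR;"]
    for task in sorted(upstream):
        # Declare the node so tasks without edges still show up.
        lines.append(f'    "{task}";')
        for dep in sorted(upstream[task]):
            lines.append(f'    "{dep}" -> "{task}";')
    lines.append("}")
    return "\n".join(lines)


dot_source = dag_to_dot({"extract": [], "transform": ["extract"], "load": ["transform"]})
```

Writing `dot_source` to `dag.dot` and running `dot -Tpng dag.dot -o dag.png` should then produce the image, provided Graphviz is installed.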