123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394 |
- .TH REALTIME 3
- .EQ
- delim $$
- .EN
- .SH NAME
- realtime \- real time scheduling
- .SH SYNOPSIS
- .EX
- .ta 4n +11n +7
- .B bind -a #R /dev
- .ift .sp 0.5
- .ifn .sp
- /dev/realtime/clone
- /dev/realtime/debug
- /dev/realtime/log
- /dev/realtime/nblog
- /dev/realtime/resources
- /dev/realtime/task/0
- /dev/realtime/task/1
- /dev/realtime/task/...
- /dev/realtime/time
- .sp
- #include "realtime.h"
- .ift .sp 0.5
- .ifn .sp
- enum SEvent {
- SAdmit, /* new proc arrives*/
- SRelease, /* released, but not yet scheduled */
- SRun, /* one of this task's procs started running */
- SPreempt, /* the running task was preempted */
- SBlock, /* the last of the runnable procs in a task blocked */
- SResume, /* task was scheduled after blocking */
- SDeadline, /* proc's deadline (reported ahead of time) */
- SYield, /* proc reached voluntary early deadline */
- SSlice, /* slice exhausted */
- SExpel, /* proc was expelled */
- SResack, /* acquire resource */
- SResrel, /* release resource */
- };
- typedef enum SEvent SEvent;
- typedef vlong Time;
- typedef struct Schedevent Schedevent;
- .ift .sp 0.5
- .ifn .sp
- struct Schedevent {
- ushort tid; /* Task ID */
- SEvent etype; /* Event type */
- Time ts; /* Event time */
- };
- .EE
- .SH DESCRIPTION
- The Plan 9 real-time scheduler allows mixing real-time processes and best-effort
- processes on a single system. The scheduler is intended for real-time loads well
- under 100% CPU utilization with minimum periods in the order of a hundred thousand
- instruction times. The precision of the scheduler depends on the precision of the
- machine's programmable real-time clock.
- .PP
- The unit of real-time scheduling is a
- .BR task .
- A task can contain zero or more processes. The processes in a task
- share a common period, deadline and CPU allocation. Tasks allow
- cooperating processes to collaborate to achieve a common deadline, for
- instance, processes receiving, decompressing and displaying video in a
- pipeline can share a single task.
- .PP
- Tasks are scheduled using Earliest-Deadline First (EDF). Each task
- has three primary scheduling parameters, a period $T$, a deadline $D$
- and a cost $C$, expressed in nanoseconds (but converted internally to
- a machine- and architecture-dependent unit called
- .IR tick ).
- Every $T$ nanoseconds, the task is
- .I released
- — i.e., it becomes schedulable — and it received a
- .I slice
- of $C$ nsec.
- When the task is released, its
- .I "release time"
- $r$ is set to the current time.
- The task's
- .I "absolute deadline"
- $d$ is set to $r + D$.
- (Note the use of capital letters to indicate intervals and lower-case
- letters to indicate `points in time'.)
- Between release and deadline the task must be scheduled often enough
- to be able to consume its slice of $C$ nsec of CPU time.
- So, to be schedulable, $C <= D <= T$ must hold.
- If $D < T$, tasks are not schedulable (runnable) between their
- deadline and the next release.
- Tasks are
- .I released
- from the moment of release until their deadline or their slice runs
- out, whichever happens first. Additionally, tasks can voluntarily
- declare the slice for the current period empty, allowing other
- real-time tasks or best-effort tasks the CPU time remaining in their
- slice.
- .PP
- Before they are released for the first time, tasks have to be
- .IR admitted
- into the system. Before admission, a test is conducted to determine
- whether the task, plus the already admitted tasks can all meet their
- deadlines, given their scheduling parameters. When a task changes its
- scheduling parameters, it is temporarily
- .I expelled
- and will be readmitted if the new scheduling parameters allow all
- tasks to continue to meet their deadlines.
- .PP
- The EDF scheduler schedules the released task with the earliest
- deadline. When a task is released it can, therefore, preempt a task
- that has a later deadline, but was released earlier.
- .PP
- Real-time tasks sharing resources, however, may need to be prevented
- from preempting each other. They can do this by declaring a shared
- resource. The scheduler will not preempt one task with another that
- shares a common resource. To do this, the scheduler needs to know the
- names of the sharable resources used by each of the tasks, the amount
- of time the resources are used, and the way in which resource usage is
- nested.
- .PP
- Resource usage must be strictly nested; i.e., it is legal for a task to
- first to acquire resource $A$, then to acquire resource $B$, to
- release $B$ again and finally to release $A$. It is also legal for it
- to acquire and release $A$ before acquiring and releasing $B$. But
- it is not legal to acquire $A$, acquire $B$, release $A$ and release $B$
- in that order.
- .PP
- During the admission test, information about resource sharing is used
- to calculate secondary deadlines for each task, called
- .IR "inherited deadlines" ,
- $Δ$, which is the minimum of the deadlines $D$ of each of the tasks
- that shares a resource currently held by the current task.
- .PP
- Acquiring a resource will lower the $Δ$ of the task (or keep it the
- same); releasing one will increase the $Δ$ (or keep it the same).
- .PP
- The scheduler allows a released task $T$ to preempt running task $T'$
- iff $d < d' ^ D < Δ'$; the first half of the condition says that the
- released task's deadline must be earlier, the second implies that the
- released task may not share resources with the running one (after all,
- if $T'$ had acquired a resource that $T$ uses, its $Δ'$ would, by the
- definition of inherited deadlines, be less than or equal to $D$).
- .PP
- The admission test takes these limitations into account. Note that,
- in practice, tasks rarely share resources.
- .PP
- Normally, tasks can block (when all their processes are blocked on
- system calls) during their release. The time spent blocking is not
- accounted against the current slice, but, since the scheduler has no
- control over the time spent blocking, there are no guarantees that the
- deadline will be made if a task blocks during its release. Whether or
- not tasks actually block, they will normally complete the work to be
- done in a period (slightly) before the slice runs out — and before
- the deadline. To tell the scheduler that they no longer require the
- CPU until the next release, tasks can
- .I yield
- (see below).
- .PP
- Tasks can also run in another mode where blocking automatically
- implies immediate yielding. This mode truly guarantees that deadlines
- will be made, but programmers must know more about the (blocking)
- nature of the system calls used in order to use this mode.
- .sp
- .LP
- The real-time scheduler is controlled through the
- .B #R
- file system, normally bound to
- .BR /dev .
- Opening
- .B /dev/realtime/clone
- for writing (and reading, if desired), causes a new task to be created in the
- non-admitted (expelled) state. The new task will be represented by a file in
- the
- .B /dev/realtime/task
- directory whose name $n$ is the task's number.
- The file descriptor resulting from opening
- .B /dev/realtime/clone
- provides the equivalent functionality as that resulting from opening
- .BI /dev/realtime/task/ n
- except for the initial creation of a new task when opening the
- .B clone
- file. The task file may also be opened read-only.
- .PP
- The task's parameters are set or modified by writing one of these files and they
- can be examined by reading it.
- A parameter is presented in the form
- \f2name\fP=\f2value\fP, \f2name\fP+=\f2value\fP, or \f2name\fP-=\f2value\fP.
- A command is presented in the form
- .IR commandname .
- Parameters and commands are separated by white space; quoting conventions apply
- (see
- .IR quote (2)).
- The settable parameters are
- .TP
- .B T
- Sets the period. The value must be given as a number with or without decimal point
- and directly followed (no white space) by one of
- .I s
- for seconds,
- .I ms
- for milliseconds,
- .I µs
- or
- .I us
- for microseconds,
- .I ns
- or
- nothing
- for nanoseconds. The
- .B +=
- operator adds to the period already set, the
- .B -=
- operator subtracts from it.
- .TP
- .B D
- Sets the deadline. Value syntax and operators as above.
- .TP
- .B C
- Sets the cost. Value syntax and operators as above.
- .TP
- .B resources
- Sets the list of resources; resource names must consist of letters, digits, or underscores
- and they must not start with a digit. A resource name may be followed by its cost
- (using the same notation of time as in setting the $T$ attribute) and by resources
- nested within. These nested resources are indicated by an open brace '{', a list of
- nested resources and a closing brace '}'. Quoting is usually necessary.
- An example of a resource specification is
- .ti +2m
- .B "resources='a 10ms { b 2ms c 3ms {d} }'
- .br
- Here, resource $a$ is held for at most 10 ms; during this time, resource $b$ is
- acquired and then released within 2 ms. After releasing $b$, resource $c$ can
- be acquired and then resource $d$. No time is specified for $d$, so its cost
- is inherited from its parent: 3 ms. Resources $d$, $c$ and $a$ must be released
- in that order.
- .br
- .ti +1m
- The
- .B +=
- and
- .B -=
- operators are not allowed for resources (they must be specified completely every time).
- .TP
- .B acquire
- is a run time command which tells the scheduler that the named resource is being acquired.
- The scheduler will adjust the task's $Δ$ accordingly. It is illegal to acquire a resource
- not specified or not to follow the nesting order. It is legal, however, not to acquire a resource
- (and all the ones nested inside) that may legally be acquired.
- The
- .B +=
- and
- .B -=
- operators do not apply.
- .TP
- .B procs
- Sets, adds to, or subtracts from the process membership of a task. Processes
- are identified by their
- .I pid
- or the special value
- .B self
- identifying the process writing the
- .I clone
- or
- .I task
- file.
- .TP
- .B yieldonblock
- When the (integer) value is non-zero, the task is placed in
- .I yield-on-block
- mode: when all processes in the task are blocked on system calls, the task automatically
- declares the current deadline reached and will be released again in the next period.
- When the value is zero, the task can block without forfeiting the current release.
- However, when the task blocks long, the deadline may be missed. The default value is zero.
- .LP
- In addition to these settable attributes, there are a number of attributes
- to be read:
- .TP
- .B n
- Number of releases of this task,
- .TP
- .B m
- Number of deadlines missed,
- .TP
- .B p
- Number of times the task was preempted by one with an earlier deadline,
- .TP
- .B t
- Total time used (divide this by
- .I n
- to get the average time used per period),
- .TP
- .B c
- Aged average time used,
- .LP
- The commands accepted are
- .TP
- .B verbose
- This is for debugging and causes the progression of the admission test to be displayed.
- .TP
- .B admit
- This initiates an admission test. If the test fails, the write containing the
- .B admit
- command will fail. If the test succeeds the task is admitted and, if it contains
- runnable processes, it will be released (almost) immediately.
- .TP
- .B expel
- Expel the task.
- .TP
- .B remove
- Remove the task, the file descriptor will become invalid.
- .TP
- .B release
- Release the last resource acquired. There is no attribute.
- .TP
- .B yield
- Gives up the processor until the next release.
- .LP
- .sp
- .B /dev/realtime/debug
- prints debugging information about the task queues maintained by the scheduler.
- This file will go away.
- .PP
- .B /dev/realtime/log
- and
- .B /dev/realtime/nblog
- are files used by real-time monitoring processes such as
- .B rtstats
- (see
- .IR rtstats (1)).
- Reading them produces a stream of
- .B Schedevent
- structures in the machine representation. These events are
- nearly ordered by Time (nanoseconds since boot); some events can
- be reported earlier or later than they occur.
- .I Tid
- corresponds to the file name in the directory
- .BR /dev/realtime/task ,
- .I etype
- is one of the events of
- .I SEvent
- as explained in
- .BR /sys/src/9/port/devrealtime.h ,
- and
- .I ts
- is the time at which the event occurred (or will occur).
- .B Nblog
- is a non-blocking version that returns zero bytes each time there is
- nothing left to read.
- .PP
- .B /dev/realtime/resources
- is a read-only file listing the resources declared by the current set of tasks.
- .PP
- .B /dev/realtime/time
- returns the number of nanoseconds since boot, the number of ticks since boot and
- .IR fasthz ,
- the number of clock ticks per second, a billion times the ratio between the other two numbers.
- Each number is returned as a binary
- .I vlong
- in architecture-dependent representation.
- .SH EXAMPLE
- .EX
- .ta 4n +4n +4n +4n +4n +4n +4n
- char *clonedev = "#R/realtime/clone";
- void
- processvideo(){
- int fd;
- if ((fd = open(clonedev, ORDWR)) < 0)
- sysfatal("%s: %r", clonedev);
- if (fprint(fd, "T=33ms D=20ms C=8ms procs=self admit") < 0)
- sysfatal("%s: admit: %r", clonedev);
- for (;;){
- processframe();
- if (fprint(fd, "yield") < 0)
- sysfatal("%s: yield: %r", clonedev);
- }
- if (fprint(fd, "remove") < 0)
- sysfatal("%s: remove: %r", clonedev);
- close(fd);
- }
- .EE
- .SH "SEE ALSO
- .IR rtstats (1)
- .SH FILES
- .B /sys/src/cmd/rtstats/edfproc.c
- as an example of using the scheduler for a trivial test run.
- .B /sys/src/cmd/rtstats/resproc.c
- as an example of using resources.
- .SH SOURCE
- .B /sys/src/9/port/devrealtime.c
- .br
- .B /sys/src/9/port/edf.c
- .SH BUGS
- The admission test does not take multiprocessors into account. The total
- real-time load cannot exceed a single-\s-2CPU\s0 load.
|