
Re: i/o speed (was: pipes & ptys)



itschere@TechFak.Uni-Bielefeld.DE writes:
> Juergen Lock
> 
> > > Ok, got this, but: (see my other mail) it will return as soon as there is
> > > some data, say, when it gets its next time slice, which is 16ms long on
> > > my 60Hz TT-VBL, right? Now I estimate that the writer (if really using 1
> > > byte I/O), doesn't manage to output more than something like 30 chars per
> > > timeslice, so 60 slices of 30 bytes each (always completely ignoring the
> > > time I need to read the data),
> >
> >  this is the time we're trying to improve...
> 
> feel free to do so... :-)
> 
> > > makes 1800 CPS, and that looks like the number I've measured.
> >
> >  how much cps does the writer get when you send it to /dev/null
> > instead?  the difference between that and your 1800 is the only thing
> > we can improve here of course... for more the writing program has to
> > use longer writes.
> 
> naturally more, don't know exactly, but sure not 10 times as much... :-(
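
 The estimate quoted above can be written out as a tiny C sketch. The
 function name is made up for illustration; the two inputs are the
 figures from the quoted mail (60 timeslices per second on a 60Hz VBL,
 roughly 30 one-byte writes per slice).

```c
#include <assert.h>

/* Estimated throughput when the reader is woken once per timeslice and
   the writer manages only bytes_per_slice one-byte writes in between.
   (cps_estimate is a hypothetical helper, not part of MiNT.) */
static long cps_estimate(long slices_per_sec, long bytes_per_slice)
{
    assert(slices_per_sec >= 0 && bytes_per_slice >= 0);
    return slices_per_sec * bytes_per_slice;
}
```

 With the quoted numbers, cps_estimate(60, 30) gives 1800, which matches
 the measured figure.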
> 
> to summarize it:
> 
> if a program chooses to output one char, say, through a piped pseudo
> terminal, it looks more or less like this:
> 
> Cconout('!') ->
> Fwrite(?, 1, "!") ->
> tty_write(...) ->
> tty_putchar() ->
> pipe_write().

 (actually Cconout doesn't go thru Fwrite but it's still slow.)
> 
> and the reader gets:
> 
> Fread() ->
> tty_read() ->
> tty_getchar() ->
> pipe_read().

 ...only the reader can be improved.

 Fread() ->
 tty_read() ->
 one pty_bread().
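
 To illustrate why one block read beats a chain of per-character calls,
 here is a minimal sketch in portable C. The queue layout and the names
 q_getchar/q_bread are assumptions made for this example only; this is
 not the actual MiNT pty code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QSIZE 256

struct queue {                  /* hypothetical stand-in for the pty buffer */
    unsigned char data[QSIZE];
    size_t head;                /* index of the next byte to read */
    size_t count;               /* bytes currently queued */
};

/* per-character path: one call per byte, returns -1 when empty */
static int q_getchar(struct queue *q)
{
    int c;

    assert(q->head < QSIZE && q->count <= QSIZE);
    if (q->count == 0)
        return -1;
    c = q->data[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->count--;
    return c;
}

/* block path: copy up to n bytes using at most two memcpy calls */
static size_t q_bread(struct queue *q, unsigned char *buf, size_t n)
{
    size_t first;

    assert(q->head < QSIZE && q->count <= QSIZE);
    if (n > q->count)
        n = q->count;
    first = QSIZE - q->head;    /* contiguous bytes before the wrap */
    if (first > n)
        first = n;
    memcpy(buf, q->data + q->head, first);
    memcpy(buf + first, q->data, n - first);
    q->head = (q->head + n) % QSIZE;
    q->count -= n;
    return n;
}
```

 Draining k queued bytes costs k calls of q_getchar but a single call of
 q_bread, so the per-call overhead is paid once instead of k times.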
> 
> and in the meantime, half an hour is gone... ;-(
> 
> > > If a) you really use big chunks and b) modems are operated by interrupt...
> > 
> >  yup!  actually modems are operated by interrupts already, only
> > the buffer status is still polled and data is read/written
> > one-byte-at-a-time...
> 
> to be more accurate, only incoming bytes raise an interrupt, but for outgoing
> the i/o chip is polled to see if it's ready. perhaps not immediately after the
> byte was sent away, but before a char is to be sent.

 that's true for the midi port and keyboard controller but...

>  at least the MFP does also
> support an interrupt for "output-buffer-empty", the newer chips also, I think.

 (any serial port chip has that :)

> but as far as I know, these are not used so far.

 ...but the modem port(s) do use send buffers.  not that Atari's BIOS
modem drivers are known to be fast and bug-free but they do use the rx
and tx interrupts...
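
 As a rough illustration of an interrupt-driven send buffer, the sketch
 below simulates the "transmit buffer empty" handler in plain C. All
 names (txring, tx_empty_isr, tx_put, uart_data) are invented for this
 example and the chip register is just a variable; it is not the Atari
 BIOS code, and a real driver would also need volatile data and
 interrupt masking around the writer side.

```c
#include <assert.h>

#define TXSIZE 64

struct txring {                 /* hypothetical send buffer */
    unsigned char buf[TXSIZE];
    unsigned head;              /* last slot the handler consumed */
    unsigned tail;              /* last slot the writer filled */
};

static unsigned char uart_data; /* stands in for the chip's data register */
static int tx_irq_pending;      /* 1 while a byte is "in flight" */

/* called on "transmit buffer empty": send the next queued byte, if any */
static void tx_empty_isr(struct txring *r)
{
    if (r->head == r->tail) {   /* nothing queued: the interrupts stop */
        tx_irq_pending = 0;
        return;
    }
    r->head = (r->head + 1) % TXSIZE;
    uart_data = r->buf[r->head];
    tx_irq_pending = 1;
}

/* writer side: queue one byte, returns 0 if the ring is full */
static int tx_put(struct txring *r, unsigned char c)
{
    unsigned next;

    assert(r->head < TXSIZE && r->tail < TXSIZE);
    next = (r->tail + 1) % TXSIZE;
    if (next == r->head)
        return 0;
    r->buf[next] = c;
    r->tail = next;
    if (!tx_irq_pending)        /* idle chip: start the first byte by hand */
        tx_empty_isr(r);
    return 1;
}
```

 The point of the last step in tx_put is that an idle chip raises no
 further interrupts, so whoever fills an empty buffer has to get the
 first byte going.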
> 
> by using these and a new buffer managing, completely obsoleting the bios,
> you'll gain lots of speed, I presume... :-)

 yup.  and if the device can get around all the char <-> long `conversions'
etc so that a Fwrite can end up as just a few movem.l (bcopy...) you get
even more speed.  this is the write() replacement i have in uucp:

#if MINT_FASTWRITE
/* here is fastwrite()...  comments like on fastread() apply. */
int fastwrite (fd, buf, cwrite, qs)
int fd;
char *buf;
unsigned cwrite;
struct ssysdep_serial_port *qs;
{
  char *p = buf;
  int slept = 0;

  /* /dev/midi doesn't have a write buffer :-( */
  if (!S_iorec || S_biosdev == 3)
    return write (fd, buf, cwrite);

  /* if we should watch CD and it's not there flush buffers and return 0. */
  if (buf && S_fwatchcd && !pollcd (S_biosdev)) {
#if MINT_FASTREAD
    (void) fastread (fd, (char *) NULL, 0, qs);
#endif
    buf = (char *) NULL;
  }

  if (!buf) {
    long stack = Super (0L);

    /* flush send buffer...  should be safe to just set the tail pointer. */
    S_iorec[1].ibuftl = S_iorec[1].ibufhd;
    (void) Super (stack);
    return 0;
  }

  if (!cwrite)
    /* nothing to do... */
    return 0;

  do {
    char ch;
    unsigned short tail, bsize, wrap, newtail;
    long free, stack;

    /* enter supervisor mode... */
    stack = Super (0L);

    tail = S_iorec[1].ibuftl;
    bsize = S_iorec[1].ibufsiz;
    if ((free = S_iorec[1].ibufhd - tail - 1) < 0)
      free += bsize;

    /* if buffer is full or we're blocking and can't write enough */
    if (!free || (qs->fread_blocking && free < cwrite && free < bsize/2)) {
      long sleepchars;
      unsigned isleep;

      /* leave super mode. */
      (void) Super (stack);

      /* if the write should not block that's it. */
      if (!qs->fread_blocking)
	return p - buf;

      /* else sleep the (minimum) time it takes until the buffer is
	 either half-empty or has space enough for the write, whichever
	 is smaller. */
      if ((sleepchars = bsize/2) > cwrite)
	sleepchars = cwrite;
      sleepchars -= free;

      isleep = (unsigned) ((sleepchars * 10000L) / qs->ibaud);

      /* except that if we already slept and the buffer still was full we
	 sleep for at least 20 milliseconds. (driver must be waiting for
	 some handshake signal and we don't want to hog the processor.) */
      if (slept && isleep < 20)
	isleep = 20;

      if (isleep < 5)
	/* if it still would be less than 5 milliseconds then just
	   give up this timeslice */
	(void) Syield();
      else
	usleep ((unsigned long) isleep * 1000);

      /* loop and try again. */
      slept = !free;
      continue;
    }
    slept = 0;

    /* save the 1st char, we could need it later. */
    ch = *p;
    wrap = bsize - tail;
    if (free > cwrite)
      free = cwrite;
    cwrite -= free;

    /* now copy to buffer.  if it's just a few then do it here... */
    if (free < 5) {
      char *q = S_iorec[1].ibuf + tail;

      while (free--) {
	if (!--wrap)
	  q -= bsize;
	*++q = *p++;
      }
      newtail = q - S_iorec[1].ibuf;

    /* ...else use bcopy. */
    } else {
      /* --wrap and tail+1 because tail is `inc before access' */
      if (--wrap < free) {
	bcopy (p, S_iorec[1].ibuf + tail + 1, wrap);
	bcopy (p + wrap, S_iorec[1].ibuf, free - wrap);
	newtail = free - wrap - 1;
      } else {
	bcopy (p, S_iorec[1].ibuf + tail + 1, free);
	newtail = tail + free;
      }
      p += free;
    }

    /* the following has to be done with interrupts off to avoid
       race conditions. */
    {
      short sr = intsoff ();

      /* if the buffer is empty there might be no interrupt that sends
	 the next char, so we send it thru the xcon* vector. */
      if (S_iorec[1].ibufhd == S_iorec[1].ibuftl) {
	(void) xcon_exec (S_bioswrite, (unsigned char) ch);

	/* if the buffer now is still empty we must set the head pointer
	   to skip the 1st char (that we just sent). */
	if (S_iorec[1].ibufhd == S_iorec[1].ibuftl) {
	  if (++tail >= bsize)
	    tail = 0;
	  S_iorec[1].ibufhd = tail;
	}
      }
      S_iorec[1].ibuftl = newtail;

      intson (sr);
    }
    (void) Super (stack);

  /* if we may block loop until everything is written */
  } while (cwrite && qs->fread_blocking);

  return p - buf;
}
#endif /* MINT_FASTWRITE */
#endif /* __MINT__ */
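
 Two small calculations in the listing above may be easier to follow in
 isolation. The sketch below restates them as standalone helpers; the
 names ring_free and drain_ms are made up here, and the "10 bits per
 character" reading of the 10000L factor (start bit, 8 data bits, stop
 bit, result in milliseconds) is my interpretation of the code, not
 something the listing states.

```c
#include <assert.h>

/* Free slots in a ring buffer where one slot always stays unused, so
   that head == tail can only mean "empty". */
static long ring_free(unsigned head, unsigned tail, unsigned bsize)
{
    long n;

    assert(bsize > 0 && head < bsize && tail < bsize);
    n = (long) head - (long) tail - 1;
    if (n < 0)
        n += (long) bsize;      /* head is "behind" tail: wrap around */
    return n;
}

/* Milliseconds the line needs to drain `chars` characters at `baud`,
   assuming 10 bits per character. */
static unsigned long drain_ms(long chars, long baud)
{
    assert(chars >= 0 && baud > 0);
    return (unsigned long) ((chars * 10000L) / baud);
}
```

 For a 256-byte buffer, ring_free(0, 0, 256) is 255 (empty) and
 ring_free(5, 4, 256) is 0 (full); drain_ms(24, 2400) is 100.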

 when this is constantly sending ~2000 cps on modem2 the process
(uucico) takes 4 or 5 % CPU according to `top'.  (megaSTe, uucp-g with
1K packets and window 7.  receiving takes more.)

 maybe people understand better now why i call unnecessary char <-> long
shuffling overhead... :-)

 cheers
	Juergen
-- 
J"urgen Lock / nox@jelal.north.de / UUCP: ..!uunet!unido!uniol!jelal!nox
								...ohne Gewehr
PGP public key fingerprint =  8A 18 58 54 03 7B FC 12  1F 8B 63 C7 19 27 CF DA