cnet Frequently Asked Questions

This file answers a number of Frequently Asked Questions (FAQs) about the cnet network protocol simulator. Please read it first so that you understand what is happening and can anticipate any errors you may encounter.

Questions answered in this file:

  1. How do I get started with cnet?
  2. How do I compile my cnet protocol source files?
  3. Are there any simple tricks that can help my understanding?
  4. What is the CHECK function that appears in most examples?
  5. What are timers and CnetTimers?
  6. What is the difference between a node's number and its node address?
  7. How can I debug my cnet programs?
  8. Why does cnet terminate after 3 minutes?
  9. What is the meaning of "spelling mistake on line 83 of protocol.c"?
  10. What is the meaning of "caught signal number while (last) handling ......."?
  11. How can I speed up cnet?
  12. How do I collate cnet statistics for plotting?

Please email chris@cs.uwa.edu.au if you find any errors in this file, or think that some additional material should be added.


How do I get started with cnet?

Firstly, you should read the Unix manual entry for cnet. You can read this by issuing the command:

                       man cnet

on your Unix machine, taking note of any ``local conditions'' such as the location of cnet's files and examples. The manual entry outlines the capabilities of cnet, lists the many options available, and provides prototypes for all available cnet functions.

There are also a number of on-line WWW pages describing cnet.

Having read the manual entry, read the cnet-specific header file (typically installed at /usr/local/include/cnet.h). All cnet programs must include the line

                       #include <cnet.h>

to include the contents of this file. In particular, it is important that you understand the CnetNodeinfo and CnetLinkinfo structures, the CnetEvent and CnetError enumerated types and all function prototypes.


How do I compile my cnet protocol source files?

You do not need to compile your protocol files yourself. Simply executing

cnet TOPOLOGY

will cause cnet to locate and compile your ANSI-C files and produce a shared object file in your working directory (e.g. protocol.cnet). This shared object is then dynamically linked with the cnet simulator at run-time.

The system's standard C compiler is used, such as GNU's gcc. All C error messages are reported by the compiler (not cnet). All fatal and warning messages should be eliminated before your protocol code can be considered syntactically correct. You will probably receive many more error messages than you've experienced before - the reason being that the compiler is invoked with extra compilation switches to be very pedantic (this is good for your soul and in fact is how you should always compile C code). If you are concerned about any ``black magic'' destroying your code, observe what happens by invoking cnet as:

cnet -v TOPOLOGY


Are there any simple tricks that can help my understanding?

Many people get confused with cnet's apparent ability to manage multiple nodes simultaneously within a single process (which is, in fact, one of its unique features). In particular, it can be initially confusing to understand how a single protocol can act as both a sender and receiver simultaneously. A simple trick to ease this confusion is to only allow one node to transmit and the other to receive (in a network of just 2 nodes). As nodes have nodenumbers of 0, 1, 2, ... adding the lines

    if(nodeinfo.nodenumber == 0)
        (void)CNET_enable_application(ALLNODES);

to your handler for EV_REBOOT, typically reboot_node(), will only permit node #0 to generate messages for delivery.


What is the meaning of "spelling mistake on line 83 of protocol.c"?

There is a spelling mistake on line 83 of file protocol.c


What is the meaning of "caught signal number 10 while (last) handling Perth.EV_APPLICATIONREADY"?

Old tricks for young players.

Fatal error messages of this form generally indicate a major problem with your protocols. The number (typically 2, 10 or 11) refers to a Unix signal number intercepted by cnet (see /usr/include/signal.h). For example, signal number 2 will be caught and reported by cnet if you interrupt cnet from the shell level (signal 2 = SIGINT). The other common signals, 10 and 11, reveal significant flaws in your protocols.

Signal 10 (SIGBUS, a bus error) occurs when the CPU attempts to execute an instruction not on an instruction boundary (on many architectures, you've requested to execute an instruction whose address is not a multiple of 4). This error will generally occur when your programming corrupts your program's stack and, in particular, you corrupt the return address of the currently executing function. When the current function attempts to return (to a now incorrect address) and then fetches an instruction whose address is invalid, signal 10 will result.

Signal 11 (SIGSEGV, segmentation violation) occurs when your program attempts to address memory that has not been mapped into your address space. Typically, a pointer that has not been correctly initialized, or has been modified/overwritten incorrectly, will point to memory that you do not ``own'', it being owned by either the operating system or another (person's) process. When attempting to access outside of your memory segment, you will get a segmentation violation. Operating systems that do not provide memory protection (segmentation), for example DOS, will not report this error, as the (single) process on those operating systems ``owns'' all of the address space. Your program there will still (maybe!) exhibit errors but these may not be reported to you. Unix is in fact doing you a favour.

Signals 10 and 11 spell disaster for your programs - there is obviously something seriously wrong with your program if they happen. Both forms of error most frequently occur when you are incorrectly managing pointers and/or dynamic memory.

Such problems are very difficult to diagnose - your first action should be to check your programming logic. By their nature, the errors which *cause* signals 10 and 11 to be reported do not necessarily raise the signal immediately. You may do the wrong thing many instructions, or even seconds, before the signal is reported. For this reason, the best cnet can do is state which event handler it was executing (or had most recently been executing) when the signal occurred. This does not necessarily indicate that your programming error is in that event handler, though experience shows that this is likely.
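
One common way to corrupt the stack is to copy a message into a fixed-size buffer without first checking its length. The following is only a minimal sketch of a bounds-checked copy; FRAME, MAX_PAYLOAD and copy_payload() are hypothetical names chosen for illustration and are not part of cnet.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 256          /* hypothetical limit, not a cnet constant */

typedef struct {
    size_t len;                  /* number of valid bytes in data[] */
    char   data[MAX_PAYLOAD];    /* fixed-size payload buffer */
} FRAME;

/* Copy len bytes from src into f->data.
   Returns 0 on success, or -1 if an argument is NULL or len is too large.
   On failure the frame is left untouched, so nothing is overwritten. */
int copy_payload(FRAME *f, const void *src, size_t len)
{
    if (f == NULL || src == NULL || len > sizeof(f->data))
        return -1;
    memcpy(f->data, src, len);
    f->len = len;
    return 0;
}
```

Refusing an over-long copy turns a silent overwrite, which might only surface as signal 10 or 11 much later, into an immediate and checkable return value.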


What is the CHECK function that appears in most examples?

CHECK is actually not a function provided by cnet (or UNIX) but a C macro defined in the cnet header file.

Most of cnet's library (builtin) functions return 0 on success and something else, typically -1, on failure. In fact, if any of these functions fail, it probably indicates a serious error in a protocol (there are a few exceptions to this generalization, such as cancelling a timer that has already expired). Moreover, all functions will set the global error variable cnet_errno on failure and this may be used as an index into the globally accessible array of error strings, cnet_errstr. This is similar to the use of errno and sys_errlist in Unix C programs.

By enveloping most calls to cnet's library routines in CHECK, we can get an accurate and immediate report on the location (source file + line number + nodename) and type of each error. If using the GNU C compiler, we can also determine the function name in which the error occurred. These helpful values are passed to cnet's function CNET_exit which, if able, pops up a window highlighting the file and line number of the runtime error.

Looking at the definition of CHECK may expose the "black magic":


#if	defined(__GNUC__)
#define CHECK(call)   if((call) != 0) { \
                        CNET_exit(__FILE__,__FUNCTION__,__LINE__);}
#else
#define CHECK(call)   if((call) != 0) { \
                        CNET_exit(__FILE__,(char *)NULL,__LINE__);}
#endif

CHECK may not strictly belong in cnet's header file, but it is such a useful macro that it saves everyone from re-inventing the wheel.
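
The same pattern can be tried outside cnet. The sketch below is self-contained and uses hypothetical names throughout (MY_CHECK, report_failure, failure_count, always_ok, always_fails): instead of calling CNET_exit it merely records and prints where the failing call was made. Wrapping the macro body in do { } while(0) is a choice made for this sketch, not something the cnet header shown above does; it lets the macro be followed safely by an else.

```c
#include <stdio.h>

int         failure_count = 0;     /* how many checked calls have failed */
const char *last_file     = NULL;  /* source file of the latest failure */
int         last_line     = 0;     /* line number of the latest failure */

/* Hypothetical stand-in for CNET_exit: report and remember the location. */
void report_failure(const char *file, const char *func, int line)
{
    ++failure_count;
    last_file = file;
    last_line = line;
    fprintf(stderr, "%s:%d: checked call failed in %s()\n", file, line, func);
}

/* __func__ is standard C99; the cnet header shown above uses GNU's __FUNCTION__. */
#define MY_CHECK(call) \
    do { if ((call) != 0) report_failure(__FILE__, __func__, __LINE__); } while (0)

/* Two hypothetical library calls that follow the 0 / -1 convention. */
int always_ok(void)    { return 0; }
int always_fails(void) { return -1; }
```

Writing MY_CHECK(always_fails()); prints one line to stderr naming the file, line and enclosing function, whereas MY_CHECK(always_ok()); does nothing.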


What are timers and CnetTimers?

The event-driven nature of cnet means that your protocols cannot simply 'wait' for something to happen. The cnet scheduler will inform your protocols when important things need doing (messages to deliver, frames to receive, etc). In particular, your protocols cannot simply wait a nominated period of time and then do something appropriate after that time.

YOUR PROTOCOLS SHOULD NOT CALL sleep() or any similar functions. Instead, cnet provides timers so that the scheduler may inform your protocol when a nominated period of time has elapsed. You may have many timers quietly ``ticking away'' - they are uniquely identified by a CnetTimer.

When you create a new timer you must indicate one of 10 timer queues, EV_TIMER1..EV_TIMER10, and a period of time (in milliseconds) in the future. The function CNET_start_timer will return to you a CnetTimer so that you may keep track of which timer has expired when your timer event handler is invoked. For example:

    CnetTimer save_timer;

    save_timer = CNET_start_timer(EV_TIMER1, 1000/* ms */, 0);

will cause the event handler for EV_TIMER1 to be called in 1 second. The value of save_timer will be passed as the second parameter to the handler so that you can see which timer expired. You can have as many outstanding timers on the EV_TIMER1 queue as you want.

If you decide that you no longer want to be informed when a timer expires, you should call CNET_stop_timer with the CnetTimer of the timer in which you are no longer interested. For example,

    (void)CNET_stop_timer(save_timer);

If the cnet scheduler invokes your timer handler, then you do not need to cancel the corresponding timer (it will be done for you).
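
With several timers outstanding on one queue, a protocol needs some way of recalling what each timer was for. The following is a minimal, self-contained sketch of one such bookkeeping table. It does not call cnet at all: TimerId is a hypothetical stand-in for CnetTimer, and PENDING, MAX_OUTSTANDING and the pending_*() functions are names invented for this illustration.

```c
#define MAX_OUTSTANDING 8          /* hypothetical limit on simultaneous timers */

typedef long TimerId;              /* hypothetical stand-in for cnet's CnetTimer */

typedef struct {
    int     in_use;                /* non-zero while the timer is outstanding */
    TimerId timer;                 /* value returned when the timer was started */
    int     seqno;                 /* sequence number of the frame it guards */
} PENDING;

PENDING pending[MAX_OUTSTANDING];

/* Mark every slot unused; call once before any timers are started. */
void pending_init(void)
{
    int i;
    for (i = 0; i < MAX_OUTSTANDING; ++i)
        pending[i].in_use = 0;
}

/* Remember that 'timer' guards frame 'seqno'.
   Returns 0 on success, or -1 if the table is full. */
int pending_add(TimerId timer, int seqno)
{
    int i;
    for (i = 0; i < MAX_OUTSTANDING; ++i)
        if (!pending[i].in_use) {
            pending[i].in_use = 1;
            pending[i].timer  = timer;
            pending[i].seqno  = seqno;
            return 0;
        }
    return -1;
}

/* Call with the timer that has expired or been stopped: frees its slot and
   returns the sequence number it guarded, or -1 if the timer is unknown. */
int pending_take(TimerId timer)
{
    int i;
    for (i = 0; i < MAX_OUTSTANDING; ++i)
        if (pending[i].in_use && pending[i].timer == timer) {
            pending[i].in_use = 0;
            return pending[i].seqno;
        }
    return -1;
}
```

In a real protocol, pending_add() would be given the value returned by CNET_start_timer, and pending_take() the timer value passed to the timer event handler.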


What is the difference between a node's number and its node address?

Nodes have both a number and an address. Node numbers (available in nodeinfo.nodenumber) are 0, 1, 2, ..., NNODES-1, whereas each node's address (available in nodeinfo.nodeaddress) can be any unique non-negative value. By default, node numbers and node addresses are the same (0, 1, 2, ...).

Setting a node address attribute in the topology file, as with

    host Perth {
        address     = 351
        ....
    }

should reveal a problem if your protocols are assuming that node numbers and node addresses are always the same. In particular, the destination node addresses returned by CNET_read_application and expected by CNET_write_direct are node addresses and not node numbers.
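
A protocol that keeps per-destination state in an array therefore cannot safely index that array by node address (351 is unlikely to be a valid index). The following is a minimal sketch of one way around this: a small table that hands out array indices to addresses as they are first seen. NodeAddr, MAX_NODES and index_of_address() are hypothetical names, and the indices returned are private to the node using the table; they are not cnet node numbers.

```c
#define MAX_NODES 32                 /* hypothetical upper bound on network size */

typedef unsigned long NodeAddr;      /* hypothetical stand-in for cnet's address type */

NodeAddr known_addr[MAX_NODES];      /* addresses seen so far */
int      n_known = 0;                /* number of valid entries in known_addr[] */

/* Return a small index (0..MAX_NODES-1) for 'addr', allocating a new one
   the first time an address is seen. Returns -1 if the table is full. */
int index_of_address(NodeAddr addr)
{
    int i;
    for (i = 0; i < n_known; ++i)
        if (known_addr[i] == addr)
            return i;
    if (n_known == MAX_NODES)
        return -1;
    known_addr[n_known] = addr;
    return n_known++;
}
```

The returned index can then be used to subscript per-destination arrays (sequence numbers, buffers and so on), whatever address values the topology file assigns.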


How can I debug my cnet programs?

Because many things appear to be happening simultaneously in cnet, debugging can be difficult. All output to C's implicit standard output stream appears on each node's output window. Output to C's standard error stream will appear on the invoking shell window (tty or pty).

Each node's standard output stream can be copied to an individual file using the -o option to cnet. For example, if running a two node network with

cnet -o debug TOPOLOGY

all output will be copied (typically) to the files debug.node0 and debug.node1.

Most importantly, most cnet functions return an integer indicating their success or failure (0 for success, -1 for failure). IT IS ESSENTIAL that you examine the function's return value to ensure that it performed as expected. If you ignore this return value your protocols may fail at a much later stage in their execution and it will be extremely difficult to track down your error. If a function detects an error (and returns -1) it will also set the node-specific variable cnet_errno to reflect what went wrong. The most recent error detected by cnet may then be printed from each node (to stderr) with the function cnet_perror or you may construct your own error messages using the error descriptions in *cnet_errname[] or *cnet_errstr[].

It is also helpful to trace your protocols to see the exact ordering and arguments of cnet function calls. Tracing may be selected with the -t command line option, by setting the trace node attribute to true for all or individual nodes in the topology file, or by selecting the trace checkbox on either the default or specific node panels under X-windows. Tracing will appear on the stderr stream of cnet (typically the shell's tty) and shows each node's event handling functions being invoked (and returned from) and, within each handler, all function calls, their arguments and the function return values. Any function arguments that are modified by the functions (arguments passed by reference) are also shown after the function return values. If any errors are detected by the functions themselves, these will be reported within the trace.


Why does cnet terminate after 3 minutes?

Errors in your protocols that prevent an event handler from returning when expected also prevent the cnet scheduler from performing correctly. In particular, the scheduler cannot service events from the X-window system - for example, your requests to kill cnet itself when you detect a problem. To overcome this major problem, cnet itself times out after 3 minutes, just in case you have written incorrect protocols which have become 'stuck'. Once you are confident that your protocols are working as expected, you can easily extend this 3 minute period with, for example,

cnet -m 10 TOPOLOGY

where the command line option indicates the required number of minutes.


How can I speed up cnet?

  1. Don't print out volumes of debug information to each node's output window. The printing of large amounts of text and scrolling these windows obviously slows down cnet. Print out bad news, not good news.
  2. If you think your protocol works ``for a few minutes'' but then dies a death, change a few attributes to speed up the cnet world. For example:
    
        messagerate      = 3ms
        propagationdelay = 1ms
    
    
    should make your protocol ``work for a few seconds''. Much better.
  3. If you'd rather not wait a full second for cnet to complete one second of network activity, run with the -T option to force events to be scheduled immediately (nodeinfo.time_in_ms is updated as you'd hope).
  4. You don't need an X-terminal to run cnet; it'll run from either a PC or an ASCII terminal and detect that it is not running under X. Then you can use the -o option or explicitly send output to stdout, stderr or a file rather than expecting it to appear in its own stdout window. After a while the gimmick of the ``network map'' should've worn off and you should only be debugging bad news, ala cnet_errno and cnet_perror().


How do I collate cnet statistics for plotting?

cnet centrally collates statistics on behalf of all nodes, and displays these on the 'Statistics' popup window or at the end of a simulation run if cnet is invoked with the -s option (or the -z option to also get zero-valued statistics).

We can also print statistics more frequently (periodically) with the correct choice of command line options. These are:

-X no need for X-windows
-T run the simulation as fast as possible
-M 5 run for 5 minutes of simulation time
-s yes, we want statistics
-f 10 print statistics with a frequency (period) of 10 seconds

This will produce volumes of output to cnet's standard output stream, so we need to both capture this and probably filter only what we need. So, to capture the Efficiency measure (bytes AL/PL) every second (in the hope that it improves), we issue:

  #!/bin/sh
  #
  cnet -X -T -M 5 -s -f 1 topologyfile     | \
  grep Efficiency                          | \
  cut -d: -f2                              | \
  cat -n              > result.file

The last line takes its input (a column of 300 efficiencies) and places a line number at the beginning of each line. This is fine if we really want statistics every single second, but slowly adapting protocols may take several minutes to reach their optimum. We could develop a shellscript which accepts arguments indicating the topology file and the frequency of collection:

  #!/bin/sh
  #
  TOPFILE=$1
  FREQ=$2
  #
  cnet -X -T -M 5 -s -f "$FREQ" "$TOPFILE"  | \
  grep Efficiency                           | \
  cut -d: -f2                               | \
  awk '{ printf("%d %s\n", n+=freq, $0); }' freq="$FREQ"  > result.file

Of course, other shellscript arguments could indicate the required statistic, resultfile, etc.


cnet was written and is maintained by Chris McDonald (chris@cs.uwa.edu.au)