Here’s a preview of something I’ve been working on, that’s almost finished. It’s a version of the Stage robot simulator that can be loaded into Python as a module, rather than running as a server in a separate process. My motivation for this is to be able to run simulations in batch mode as fast as possible. The standard Stage server is designed as a backend to Player, the generic robot driver and interface system that uses a networked client/server communications module. Essentially, Stage runs as a set of Player drivers, so that the fact that it’s a simulator is abstracted away, making the simulator virtually indistinguishable from a real robot. This is a great model for those who are using the simulator to test and debug code that will eventually be ported to a real robot. The downside is for those of us who want to use a simulator to speed up time, so we can run simulations that would be prohibitively long on physical robots (days or weeks).
Stage does allow speedup using the
-u flags, but since it’s running as a server communicating through a socket, it’s difficult to maintain synchronization if it’s sped up more than 3 or 4x. In my new library, that I’m tentatively calling PyStage (like PyPlayer), the Python program that’s running the “client” also controls exactly when the world’s Update() method gets called, and so it can take as long or as short as it wants between world updates.
This single-process, synchronized execution model provides some added benefits besides just running fast. First, Running in a single process w/o network communication allows simulation runs to be submitted to distributed batch queueing systems like Condor, easily allowing many simulations to run in parallel. While Condor does allow some limited network communication in a job, keeping a TCP connection open to a simulator process for a whole job isn’t really feasible. Second, the synchronization should make it possible to use Stage with simulations of cognitive or perceptual processes that are too computationally intensive to run in real time on current computing hardware (e.g. large scale neural simulations). In this case, instead of speeding up time in the simulator, you could slow down time, allowing a cognitive process that might take many seconds to simulate to only use a fraction of a second of simulator time. Again, this is technically feasible with client/server Stage, but for it to work you need to know in advance how long to make each cycle, and if there’s variability in computation time from one cycle to another, you need to add extra “padding” to the cycle time that will often go wasted.
I now have my initial prototype working, and have managed to run a few simulations, getting about a 3-4x speedup in raw cycle time (100-120Hz vs 33hz) over my client-server runs, however because of some other changes in the simulator, I’m able to do computation on every cycle (as opposed to about every 3 cycles before), for an effective speedup of 9-12x. Of course, then the top-speed was communications-limited, and now it’s processor-limited, so I should be able to get even more speedup on faster machines. Once the code has been cleaned up, and I learn to use the GNU build system, I should be able to release a 0.1 version (or should it be 1.0?). Or, if the Stage authors are willing, I’d be willing to fold the whole thing into Stage proper. Right now, I maintain my own slightly modified version of the Stage source for use with PyStage. Another possibility is to expand to allow clients to be written run in other languages like Java or Scheme. Since I used SWIG to generate the python wrappers, this shouldn’t be too difficult, though someone else would have to do the non-Python-specific work.
Footnote: This whole experience has been a great reinforcement of the value of Free Software. Because the source code of the program was free, I was able to modify it to serve a need that the authors hadn’t anticipated, and probably would have considered low- or no-priority, and the whole process only took me a couple of weeks. I can’t imagine this would have been possible if I’d been using a closed, proprietary system.