(Updated Fri 2022-04-29) (Updated Wed 2022-05-11)

Opinionated Programming

I was asked how I might incorporate Open-MPI into a Fexl program. I had never seen Open-MPI before, so I started reading through its man pages.

It's massive. I'm not yet entirely clear on what it does, beyond generally providing some kind of message passing interface (MPI). Therefore I can only answer generally for now.

I designed Fexl to be the thinnest functional programming layer on top of C that I could conceive of. I've been writing C code for over 40 years and I love it, but I wanted high level functions, strings, tuples, lists, and automatic memory management. I wanted the best of both worlds: low-level C code and high-level lambda expressions.

An extreme example

Let's say you have a project where you need to integrate Open-MPI to do some concurrent messaging. At one extreme, you could just write every bit of your code directly in C. For example:

#include <project.h>

int main(int argc, const char *argv[])
    {
    project(argc,argv);
    return 0;
    }

Now let's say you had already done that, and now wanted to "integrate" your already completed project into Fexl. I won't ask why, and I won't laugh. You start by writing this C routine which calls your entire top-level project:

#include <project.h>
#include <value.h> /* Prerequisite for headers below. I never nest headers. */

#include <basic.h>    /* QI */
#include <type_run.h> /* main_argc main_argv */

value type_project(value f)
    {
    (void)f;                      /* The Fexl function itself takes no arguments. */
    project(main_argc,main_argv); /* Call project with command line args. */
    return hold(QI);              /* Return identity function to continue. */
    }

Then you edit the type_standard.c file in the Fexl project to define a symbol "project" which you can call from a Fexl program. First include your header file near the top:

#include <project.h>

Then add this line to the standard routine defined in that file:

    if (match("project")) return Q(type_project);

Now you can write a Fexl script "project.fxl" which does nothing but call the new project function you just defined:

project

Then you can call that script from the command line shell:

$ fexl project.fxl

You could even disguise the fact that you're using Fexl, making it into an ordinary executable file named "project", using the "shebang" feature:

#!/path/to/fexl/bin/fexl
project

Now you can run your project this way and no one is the wiser:

$ ./project

In this example, all you've done is provide a way to call your top-level project code from within Fexl. Congratulations, you've just written a single Fexl function that does everything you want!

I know this example is a bit silly, but it underscores the point that Fexl is meant to co-exist with C. If someone asks me what language I used for a project, I like to say tongue-in-cheek "It's written entirely in C, but with powerful configuration files."

Another extreme example

The previous example is at one extreme end of the spectrum. At the other extreme, you go about laboriously trying to integrate every … single … routine in the entire Open-MPI interface into Fexl, and create Fexl versions of every … single … data type in Open-MPI, including MPI_File, MPI_Offset, etc.

I don't recommend it. At all. Please don't do it. I recommend that you start somewhere between the two extremes. Figure out what you really want your Fexl code to look like at a high level, and then write C code which uses only as much of the Open-MPI library as you need. Become highly opinionated about what your message passing interface should look like within Fexl, and then grudgingly go into the C code and call Open-MPI to make that happen and nothing else.

Opinionated programming

As an example, Fexl provides a function run_process which runs a piece of Fexl code in a separate child process, returning handles to its stdin and stdout so the parent process can interact with the child. In this case the child's stderr goes to the same destination as the parent's stderr, which is typically what you want when implementing a server with an error log.

If instead you need to read the child's stderr, there's a different function spawn which will give you that file handle as well.

Now if I wanted my life to be as needlessly difficult as possible, I would go about writing Fexl interfaces to every … single … low-level C routine and data type related to file handles, pipes, catching signals, and forking processes. Then I would write run_process and spawn in terms of those very low-level Fexl functions.

That would be an extraordinarily bad move. Instead, because I had formed a nice clean opinion about what run_process and spawn should do, I just set about writing precisely the C code needed to implement those exact functions.

Here's the code in the "type_run.c" file:

/* (run_process fn_child fn_parent)

Interact with the fn_child function as a separate process, with the fn_parent
receiving handles to the child's stdin and stdout.

The child's stderr goes to the same destination as the parent's stderr, which
is typically what you want when implementing a server with an error log.
*/
value type_run_process(value f)
    {
    return op_process(f,0);
    }

/* (spawn fn_child fn_parent)

Interact with the fn_child function as a separate process, with the fn_parent
receiving handles to the child's stdin, stdout, and stderr.
*/
value type_spawn(value f)
    {
    return op_process(f,1);
    }

As it turns out, there was so much code in common between the two routines that I implemented both in terms of a single op_process routine that takes a boolean flag. And that's where the rubber meets the road:

/* Interact with the fn_child function as a separate process.

If catch_stderr is true, evaluate:
    (fn_parent child_in child_out child_err)

If catch_stderr is false, evaluate:
    (fn_parent child_in child_out)

The child_in, child_out, and child_err are file handles for the child's stdin,
stdout, and stderr respectively.

That evaluation performs an interaction using the file handles, returning a
handler function which receives the child's exit status when the child process
terminates.
*/
static value op_process(value f, int catch_stderr)
    {
    if (!f->L || !f->L->L) return 0;
    {
    /* Flush the parent's stdout and stderr to prevent any pending output from
    being accidentally pushed into the child's input.  I've noticed this can
    happen when the script output is sent to a file or pipe instead of a
    console.
    */
    fflush(stdout);
    fflush(stderr);

    {
    /* Create a series of pipes, each with a read and write side. */
    int fd_in[2];
    int fd_out[2];
    int fd_err[2];

    do_pipe(fd_in);
    do_pipe(fd_out);
    if (catch_stderr)
        do_pipe(fd_err);

    {
    pid_t pid = fork();
    if (pid == -1) die("fork failed");

    if (pid == 0)
        {
        /* This is the child process. */

        /* Duplicate read side of input pipe to stdin. */
        do_dup2(fd_in[0],0);
        /* Duplicate write side of output pipe to stdout. */
        do_dup2(fd_out[1],1);
        if (catch_stderr)
            {
            /* Duplicate write side of error pipe to stderr. */
            do_dup2(fd_err[1],2);
            }

        /* Close unused file handles.  They're actually all unused because I
        duplicated the ones I still need.  At a minimum, I must close the write
        side of the input pipe, otherwise the child hangs waiting for stdin to
        close. */

        do_close(fd_in[0]);
        do_close(fd_in[1]); /* Must do this one to avoid hang. */

        do_close(fd_out[0]);
        do_close(fd_out[1]);

        if (catch_stderr)
            {
            do_close(fd_err[0]);
            do_close(fd_err[1]);
            }

        /* Evaluate the child function. */
        drop(eval(hold(f->L->R)));

        /* Exit here to avoid continuing with evaluation.  This means that
        memory leak detection does not occur for the child function.  If you
        want that level of checking, you should exec with (argv 0) instead. */
        exit(0);
        return 0;
        }
    else
        {
        /* This is the parent process. */

        /* Open write side of input pipe as child input. */
        FILE *child_in = do_fdopen(fd_in[1],"w");
        /* Open read side of output pipe as child output. */
        FILE *child_out = do_fdopen(fd_out[0],"r");
        FILE *child_err = 0;
        if (catch_stderr)
            {
            /* Open read side of error pipe as child error. */
            child_err = do_fdopen(fd_err[0],"r");
            }

        /* Close unused file handles.  I don't close the ones I just opened
        because they are still in play (i.e. fdopen does not dup). */

        do_close(fd_in[0]);  /* Close the read side of the input pipe. */
        do_close(fd_out[1]); /* Close the write side of the output pipe. */
        if (catch_stderr)
            do_close(fd_err[1]); /* Close the write side of the error pipe. */

        {
        value handler = A(A(hold(f->R),Qfile(child_in)),Qfile(child_out));
        int status;
        if (catch_stderr)
            handler = A(handler,Qfile(child_err));

        handler = eval(handler);
        status = do_wait(pid);
        return A(handler,Qnum(status));
        }
        }
    }
    }
    }
    }

The rabbit hole continues down into some lower level routines, such as:

static void do_pipe(int fd[])
    {
    int status = pipe(fd);
    if (status == -1) die("pipe failed");
    }

static void do_dup2(int oldfd, int newfd)
    {
    int status = dup2(oldfd,newfd);
    if (status == -1) die("dup2 failed");
    }

Look how opinionated that is. I decreed that if the pipe or dup2 calls failed, then your whole process would die right then. I could do otherwise, but I do not.
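
The other wrappers that op_process calls follow the same pattern. As a rough sketch only, here is how die, do_close, and do_wait might look. The actual Fexl definitions are not shown in this article and may well differ. In particular, the message format in die and the choice to return the child's exit code (or -1 if it did not exit normally) from do_wait are my assumptions for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed behavior: print the message to stderr and halt the process. */
void die(const char *msg)
    {
    fprintf(stderr,"%s\n",msg);
    exit(1);
    }

/* Close a file descriptor, or die trying. */
void do_close(int fd)
    {
    int status = close(fd);
    if (status == -1) die("close failed");
    }

/* Wait for the child to terminate.  Assumed convention: return its exit code,
or -1 if it was killed by a signal or otherwise did not exit normally. */
int do_wait(pid_t pid)
    {
    int status;
    pid_t result = waitpid(pid,&status,0);
    if (result == -1) die("waitpid failed");
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
```

None of these hand an error code back to the caller. Each one either succeeds or ends the process, so the code in op_process can read as a straight line.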

Now can you even imagine what would be involved with trying to implement Fexl versions of all the low-level system routines like pipe or dup2? I'd have to represent all the various data types and flag values, and of course hook into perror to give you the option to halt the process that way.

As for me, I do not want those options. I prefer to restrict myself voluntarily to fewer and better options, wisely chosen in light of my experience.

A simpler example

Fexl has an exec call, which you can see in type_exec in the type_run.c file. It calls the low-level execv system routine, which requires a C array of string pointers. To interface from Fexl, the caller provides a Fexl list, e.g.

exec ["/bin/sh" "-c" "pwd"]

To implement type_exec, I had to write C code to convert the Fexl list into a low-level C array of string pointers. So that's an example of "marshalling" your data into whatever a low-level routine requires.
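
To make the marshalling step concrete, here is a minimal sketch of converting a list of strings into the NULL-terminated array that execv expects. This is not the code in type_exec: the str_list structure and the list_to_argv function are hypothetical stand-ins, and Fexl's real list representation is different.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical singly linked list of C strings, standing in for a Fexl list. */
struct str_list
    {
    const char *str;
    struct str_list *next;
    };

/* Build a NULL-terminated array of string pointers suitable for execv.  The
array borrows the strings from the list, so the caller frees only the array
itself.  In keeping with the opinionated style, running out of memory halts
the process. */
char **list_to_argv(const struct str_list *list)
    {
    const struct str_list *pos;
    size_t count = 0;
    size_t i = 0;
    char **argv;

    /* First pass: count the items. */
    for (pos = list; pos; pos = pos->next)
        count++;

    argv = malloc((count + 1) * sizeof(char *));
    if (!argv)
        {
        fprintf(stderr,"out of memory\n");
        exit(1);
        }

    /* Second pass: copy the pointers, then add the terminating NULL. */
    for (pos = list; pos; pos = pos->next)
        argv[i++] = (char *)pos->str;
    argv[count] = 0;

    return argv;
    }
```

With a three-item list holding "/bin/sh", "-c", and "pwd", the result could be passed as execv(argv[0], argv). The Fexl caller never sees any of this. It just writes a list.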

Conclusion

So when you're integrating Fexl with Open-MPI, I suggest that you follow this basic approach. Figure out how you want it to look from a Fexl program and go from there.