Option 3: vfork() has existed for a long time. The child process temporarily borrows all of the parent's address space, and the calling process is suspended until the child exits or calls a flavor of exec. Granted, it's pretty brittle: before exec is called, it's undefined behavior for the child to modify any data other than the pid_t variable holding vfork()'s return value, to return from the function that called vfork(), or to call anything other than _exit() or an exec function. However, it gets around the disadvantages of fork() while maintaining all of the flexibility of Unix's separation of process creation (fork/vfork) and process initialization (exec*).
vfork followed immediately by exec gives you Windows-like process creation, and last I checked, despite having the overhead of a second syscall, was still faster than process creation on Windows.
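For anyone who hasn't seen the pattern, here's a minimal sketch of vfork followed immediately by exec. The helper name spawn_and_wait and the 127 exit code on exec failure (borrowed from shell convention) are my own choices for illustration, and the _DEFAULT_SOURCE define assumes a glibc-style platform that still declares vfork().

```c
#define _DEFAULT_SOURCE   /* assumption: glibc-style feature macro exposing vfork() */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by argv[0], wait for it,
 * and return its exit status. Returns 127 if exec failed in the child,
 * or -1 if vfork/waitpid failed or the child did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = vfork();          /* parent is suspended until exec or _exit */
    if (pid < 0)
        return -1;                /* vfork itself failed */

    if (pid == 0) {
        /* Child: running in the parent's address space, so touch nothing
         * and go straight to exec. */
        execvp(argv[0], argv);
        _exit(127);               /* exec failed; use _exit, never exit or return */
    }

    /* Parent: resumes here once the child has exec'd or exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv such as {"/bin/sh", "-c", "exit 3", NULL} should hand back the child's exit status. Anything you'd normally do between fork and exec (redirecting descriptors, changing directory) is exactly the kind of thing that gets risky here, which is the brittleness mentioned above.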