Hello,
While experimenting with the `wasm32-wasip2` target and CPython, I
discovered an issue with the `getaddrinfo()` implementation: it fails to
resolve the provided service into a port number, causing `sin_port` to
always be set to 0. This issue leads to failures in network-related
functions that rely on `getaddrinfo()`, such as Python's `urllib3`
library, which passes the result directly to `connect()`. The resulting
connection attempts use port 0 and therefore fail.
### Minimal example to reproduce the problem
```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>

int main(void) {
    struct addrinfo *res = NULL;

    // Resolve the host with a numeric service; every result should carry port 443.
    int err = getaddrinfo("google.com", "443", NULL, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo failed: %d\n", err);
        return 1;
    }

    for (struct addrinfo *i = res; i != NULL; i = i->ai_next) {
        char str[INET6_ADDRSTRLEN];
        if (i->ai_addr->sa_family == AF_INET) {
            struct sockaddr_in *p = (struct sockaddr_in *)i->ai_addr;
            int port = ntohs(p->sin_port);
            printf("%s: %i\n", inet_ntop(AF_INET, &p->sin_addr, str, sizeof(str)), port);
        } else if (i->ai_addr->sa_family == AF_INET6) {
            struct sockaddr_in6 *p = (struct sockaddr_in6 *)i->ai_addr;
            int port = ntohs(p->sin6_port);
            printf("%s: %i\n", inet_ntop(AF_INET6, &p->sin6_addr, str, sizeof(str)), port);
        }
    }

    freeaddrinfo(res);
    return 0;
}
```
```
$ /opt/wasi-sdk/bin/clang -target wasm32-wasip2 -o foo foo.c
$ wasmtime run -S allow-ip-name-lookup=y foo
216.58.211.238: 0
2a00:1450:4026:808::200e: 0
```
Expected output:
```
216.58.211.238: 443
2a00:1450:4026:808::200e: 443
```
### Root Cause
The root cause is that `getaddrinfo()` does not translate the provided
service into a port number. The `getaddrinfo()`
[man page](https://man7.org/linux/man-pages/man3/getaddrinfo.3.html)
describes the expected behavior:
> service sets the port in each returned address structure. If
> this argument is a service name (see
> [services(5)](https://man7.org/linux/man-pages/man5/services.5.html)),
> it is translated to the corresponding port number. This argument can
> also be specified as a decimal number, which is simply converted
> to binary. If service is NULL, then the port number of the
> returned socket addresses will be left uninitialized.
### Proposed Fix
This pull request addresses the issue by implementing the following
behavior for `getaddrinfo()`:
* If the service is `NULL`, the port number in the returned socket
addresses remains uninitialized.
* If the service is numeric, the value is converted to an integer and
validated.
The PR does not currently add support for translating named services
into port numbers because `getservbyname()` has not been implemented. In
cases where a named service is provided, the `EAI_NONAME` error code is
returned.
* implement basic TCP/UDP client support
This implements `socket`, `connect`, `recv`, `send`, etc. in terms of
`wasi-sockets` for the `wasm32-wasip2` target.
I've introduced a new public header file, `__wasi_snapshot.h`, which defines
the preprocessor symbol `__wasilibc_use_wasip2` in the `wasm32-wasip2` version
of the header. When that symbol is defined, we provide features that are only
available for that target.
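
As a rough sketch of how application code might test for that symbol: the function name `built_for_wasip2` is hypothetical, and the example assumes the header is reachable as `<__wasi_snapshot.h>` on WASI toolchains. The `__has_include` guard is only there so the same source still compiles on hosts that lack the header.

```c
// Pull in the snapshot header only when the toolchain provides it, so the
// same source also builds for non-WASI hosts.
#if defined(__has_include)
#if __has_include(<__wasi_snapshot.h>)
#include <__wasi_snapshot.h>
#endif
#endif

// Hypothetical helper: returns 1 when compiled against the wasm32-wasip2
// flavor of the headers (where __wasilibc_use_wasip2 is defined) and 0
// otherwise.
static int built_for_wasip2(void) {
#ifdef __wasilibc_use_wasip2
    return 1;
#else
    return 0;
#endif
}
```

Code that calls `socket` or `connect` could be wrapped in the same `#ifdef` so that it is only compiled for the target that implements them.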
Co-authored-by: Dave Bakker <github@davebakker.io>
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* fix grammar in __wasi_snapshot.h comment
Co-authored-by: Dan Gohman <dev@sunfishcode.online>
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
---------
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
Co-authored-by: Dave Bakker <github@davebakker.io>
Co-authored-by: Dan Gohman <dev@sunfishcode.online>
* Start renaming preview1 to p1 and preview2 to p2
This is a first step toward renaming the "preview" terminology in WASI
targets to "pX". For example the `wasm32-wasi` target should transition
to `wasm32-wasip1`, `wasm32-wasi-preview2` should transition to
`wasm32-wasip2`, and `wasm32-wasi-threads` should transition to
`wasm32-wasip1-threads`. This commit applies a few renames in the
`Makefile` such as:
* `WASI_SNAPSHOT` is now either "p1" or "p2"
* The default p2 target triple is now `wasm32-wasip2` instead of
`wasm32-wasi-preview2` (in the hopes that it's early enough to change
the default).
* Bindings for WASIp2 were renamed from "preview2" terminology to "wasip2".
* The expected-defines files are renamed, and the logic for selecting which
expectation is used has been updated slightly.
With this commit the intention is that non-preview2 defaults do not
change. For example the default build still produces a `wasm32-wasi`
sysroot. If `TARGET_TRIPLE=wasm32-wasip1` is passed, however, then that
sysroot is produced instead. Similarly a `THREAD_MODEL=posix` build
produces a `wasm32-wasi-threads` sysroot target but you can now also
pass `TARGET_TRIPLE=wasm32-wasip1-threads` to rename the sysroot.
My hope is to integrate this into the wasi-sdk repository and build a
dual sysroot for these new targets for a release or two, so that both
names are supported. In the future, the built-by-default target can then
be switched from `wasm32-wasi` to `wasm32-wasip1`.
* Update builds in CI
* Update test workflow
* Fix test for wasm32-wasip1-threads
* Make github actions rules a bit more readable
* add descriptor table for mapping fds to handles
This introduces `descriptor_table.h` and `descriptor_table.c`, providing a
global hashtable for tracking `wasi-libc`-managed file descriptors.
WASI Preview 2 has no notion of file descriptors and instead uses unforgeable
resource handles. Moreover, there's not necessarily a one-to-one correspondence
between POSIX file descriptors and resource handles (e.g. a TCP connection may
require separate handles for reading, writing, and polling the same connection).
We use this table to map each POSIX descriptor to a set of one or more handles
and any extra state which libc needs to track.
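
To make the idea concrete, here is a minimal sketch of such a mapping. It is not the contents of `descriptor_table.c`: every name is hypothetical, handles are assumed to be 32-bit integers, the table has a fixed capacity, descriptors are assumed to be non-negative, and removal is left out to keep the probing logic short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// All names below are hypothetical and for illustration only.
// One POSIX descriptor for a TCP connection may need several handles.
typedef struct {
    int32_t socket;        // handle for the connection itself
    int32_t input_stream;  // handle used for reading
    int32_t output_stream; // handle used for writing
    int32_t pollable;      // handle used for polling readiness
} tcp_handles_t;

typedef struct {
    bool occupied;
    int fd;
    tcp_handles_t handles;
} table_slot_t;

#define TABLE_CAPACITY 64

// Global table; static storage starts out zeroed, so every slot is empty.
static table_slot_t table[TABLE_CAPACITY];

// Inserts or updates the entry for `fd` using open addressing with linear
// probing. Returns false if the table is full.
static bool table_insert(int fd, tcp_handles_t handles) {
    for (size_t probe = 0; probe < TABLE_CAPACITY; probe++) {
        table_slot_t *slot = &table[((size_t)fd + probe) % TABLE_CAPACITY];
        if (!slot->occupied || slot->fd == fd) {
            slot->occupied = true;
            slot->fd = fd;
            slot->handles = handles;
            return true;
        }
    }
    return false;
}

// Returns the handles registered for `fd`, or NULL if there are none.
static const tcp_handles_t *table_lookup(int fd) {
    for (size_t probe = 0; probe < TABLE_CAPACITY; probe++) {
        const table_slot_t *slot = &table[((size_t)fd + probe) % TABLE_CAPACITY];
        if (!slot->occupied) {
            return NULL; // an empty slot ends the probe sequence
        }
        if (slot->fd == fd) {
            return &slot->handles;
        }
    }
    return NULL;
}
```

With a table like this, a `read`-style call would look up the descriptor and use `input_stream`, while a `write`-style call would use `output_stream` from the same entry.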
Note that we've added `descriptor_table.h` to the
libc-bottom-half/headers/public/wasi directory, making it part of the public
API. The intention is to give applications access to the mapping, enabling them
to convert descriptors to handles and vice-versa should they need to
interoperate with both libc and WASI directly.
Co-authored-by: Dave Bakker <github@davebakker.io>
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* add dummy fields to otherwise empty structs
The C standard doesn't allow empty structs. Clang doesn't currently complain,
but we might as well stick to the spec in case it becomes more strict in the
future.
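
For illustration, a placeholder member like the one below is enough to keep such a type valid ISO C. The type name is hypothetical and not taken from the actual sources.

```c
// ISO C requires a struct to declare at least one member; an empty
// `struct {}` is accepted by Clang and GCC only as an extension.
// Hypothetical state type that has nothing to track yet:
typedef struct {
    int dummy; // placeholder so the struct is valid ISO C
} unbound_state_t;
```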
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* move descriptor_table.h to headers/private
We're not yet ready to commit to making this API public, so we'll make it
private for now.
I've also expanded a comment in descriptor_table.c to explain the current ABI
for resource handles.
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
* re-run clang-format to fix indentation
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
---------
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
Co-authored-by: Dave Bakker <github@davebakker.io>
* Use constructor functions for optional init routines.
Instead of using weak symbols, use constructor function attributes for the
environment and preopen initialization routines. This is simpler, uses
less code, and is more LTO-friendly.
* Change the constructor priorities to start at 50.
We don't currently have specific plans for other levels in the reserved
range (0-100), so leave room for both lower and higher priorities.
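
The mechanism can be sketched with two prioritized constructors. The function names are hypothetical stand-ins, and the sketch uses priorities 101 and 102 rather than 50 because values up to 100 are reserved for the implementation, which is the range libc itself draws from.

```c
// Records the order in which the constructors ran. Zero-initialized static
// storage is set up before any constructor executes.
static int init_order[2];
static int init_count;

// Hypothetical stand-in for an optional initialization routine. It runs
// before main(), but only if the object file containing it is linked in.
__attribute__((constructor(101)))
static void init_environment(void) {
    init_order[init_count++] = 1;
}

// A constructor with a higher priority number runs later.
__attribute__((constructor(102)))
static void init_preopens(void) {
    init_order[init_count++] = 2;
}
```

By the time `main()` starts, both routines have run in priority order, without the startup code needing a weak reference to either of them.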
* Make __wasi_linkcount_t a uint64_t (#134)
Refs: https://github.com/WebAssembly/WASI/pull/127
* Generate the WASI interface from witx.
This replaces the hand-maintained <wasi/core.h> header with a
<wasi/api.h> generated from witx.
Most of the churn here is caused by upstream WASI renamings; hopefully
in the future ABI updates will be less noisy.
* Link `populate_environ` only if we actually need environment variables.
This avoids linking in the environment variable initialization code,
and the __wasi_environ_sizes_get and __wasi_environ_get imports, in
programs that don't use environment variables.
This also removes the "___environ" (three underscores) alias symbol,
which is only in musl for backwards compatibility.
* Switch to //-style comments.
* If malloc fails, don't leave `__environ` pointing to an uninitialized buffer.
* Fix a memory leak if one malloc succeeds and the other fails.
* Use calloc to handle multiplication overflow.
This also handles the NULL terminator.
* Don't initialize __environ until everything has succeeded.
* Avoid leaking in case __wasi_environ_get fails.
* Handle overflow in the add too.
* Add #include <stdlib.h> for malloc etc.
* If the environment is empty, don't allocate any memory.
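
Taken together, the fixes above suggest an initialization pattern along these lines. This is a sketch under stated assumptions rather than the actual `populate_environ` code: `build_environ` and the fill callbacks are hypothetical, and the callback stands in for the host call that writes the pointers and strings. It also assumes that a non-empty environment always reports a non-zero buffer size.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical callback standing in for the host call that fills the
// pointer array and the string buffer; returns 0 on success.
typedef int (*environ_fill_fn)(char **ptrs, char *buf);

// Builds a NULL-terminated environment array for `count` variables whose
// strings occupy `buf_size` bytes. On success with count > 0, stores the
// array in *out. On failure, or when the environment is empty, *out is
// left untouched and nothing stays allocated.
static int build_environ(size_t count, size_t buf_size, environ_fill_fn fill,
                         char ***out) {
    if (count == 0) {
        return 0; // empty environment: allocate nothing
    }

    // Room for the NULL terminator; guard the addition against wraparound.
    size_t num_ptrs = count + 1;
    if (num_ptrs == 0) {
        return -1;
    }

    char *buf = malloc(buf_size);
    if (buf == NULL) {
        return -1;
    }

    // calloc() checks the count * size multiplication for overflow and
    // zero-fills the array, which also provides the NULL terminator.
    char **ptrs = calloc(num_ptrs, sizeof(char *));
    if (ptrs == NULL) {
        free(buf); // do not leak the first allocation
        return -1;
    }

    if (fill(ptrs, buf) != 0) {
        free(ptrs);
        free(buf);
        return -1;
    }

    *out = ptrs; // publish only after every step has succeeded
    return 0;
}

// Example fill callbacks used only for demonstration.
static int fill_one_variable(char **ptrs, char *buf) {
    memcpy(buf, "A=1", 4); // copies the trailing NUL as well
    ptrs[0] = buf;
    return 0;
}

static int fill_always_fails(char **ptrs, char *buf) {
    (void)ptrs;
    (void)buf;
    return -1;
}
```

Passing the address of the environment pointer as `out` means a failed or empty initialization leaves the previous value in place instead of pointing it at an uninitialized buffer.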