Add support for hot reload. It should be used in order for resource
updates to take place.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add support for devlink resource abstraction. The resources are
represented by a tree based structure and are identified by a name and
a size. Some resources can present their real time occupancy.
First the resources exposed by the driver can be observed, for example:
$devlink resource show pci/0000:03:00.0
pci/0000:03:00.0:
name kvd size 245760 unit entry
resources:
name linear size 98304 occ 0 unit entry size_min 0 size_max 147456 size_gran 128
name hash_double size 60416 unit entry size_min 32768 size_max 180224 size_gran 128
name hash_single size 87040 unit entry size_min 65536 size_max 212992 size_gran 128
Some resource's size can be changed. Examples:
$devlink resource set pci/0000:03:00.0 path /kvd/hash_single size 73088
$devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 74368
The changes do not apply immediately, this can be validate by the 'size_new'
attribute, which represents the pending changed size. For example
$devlink resource show pci/0000:03:00.0
pci/0000:03:00.0:
name kvd size 245760 unit entry size_valid false
resources:
name linear size 98304 size_new 147456 occ 0 unit entry size_min 0 size_max 147456 size_gran 128
name hash_double size 60416 unit entry size_min 32768 size_max 180224 size_gran 128
name hash_single size 87040 unit entry size_min 65536 size_max 212992 size_gran 128
In case of a pending change the nested resources present an indication
for a valid configuration of its children (sum of its children sizes
doesn't exceed the parent's size).
In order for the changes to take place hot reload is needed. The hot
reload through devlink will be introduced in the following patch.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Currently multi-line objects are separated by new-lines. This patch
changes this behavior by using indentations for separation.
Signed-off-by: Arkadi Sharhsevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
kill_inet_sock() expects rhn_handle instance is passed
via inet_diag_arg argument. However on the following calling path:
generic_show_sock
=> show_one_inet_sock
=> kill_inet_sock
rth field of inet_diag_arg is not filled with the address of
rhn_handle instance. As the result ss crashes.
This commit fills the field with newly created rhn_handle
instance.
Changes in v2:
Instead of creating rtn_handle instances for each socket, create
one in upper layer and reuse it.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The "Information" link was removed from README file in commit
d7843207e6 ("README: update location of git repositories, remove
broken info link"), because it redirected to a page that no longer
existed on the Linux Foundation wiki.
This page has just been restored, so we can add the link back again.
Since the previous link was a redirection, use the updated link instead.
Thanks to Luca Boccassi for investigating this issue, restoring and
updating the page.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Instead of declaring -color and -json exclusive, ignore -color when
-json is provided. The rationale is to allow to put -color in an alias
for ip while still being able to use -json. -color is merely a
presentation suggestion and we can assume there is nothing to color in
the JSON output.
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The tc_print_action() function did not print all tc actions
when e.g. TCA_ACT_MAX_PRIO actions were defined for a single
tc filter.
Signed-off-by: Adam Vyskovsky <adamvyskovsky@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Prevent a double space in "bridge mdb show" when the MDB entry is not
marked as "offload".
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
It will fail with EPERM on Linux 4.15.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since there is no documentation in Latex format left, there is no need
to check for commands to build it. Also there is no need to ignore any
of the temporary files which were created by them.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reflect the recent change of location for the git repositories, and the
creation of the -next development repo, in README and README.devel.
Also remove the link to the Linux Foundation wiki that contained
information about iproute2. The link is now broken, I did not find any
alternative page to point to.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: David Ahern <dsahern@gmail.com>
If the kernel receives a negative nsid it will automatically assign
the next available nsid. In this case alloc_netid() will set min and
max to 0 for ird_alloc(). And when max == 0 idr_alloc() will interpret
this as the maximum range, i.e. specific to nsids it will try to find
an id in the range [0,INT_MAX). This is intentionally supported in the
kernel for nsids.
Commit acbe9118ce ("ip netns: use strtol() instead of atoi()")
regressed ip netns in that respect although previously the use-case
was either accidentally supported or opaquely supported such that it
triggered the original commit. From what I can gather it went as
follows before: atoi() was called with a string indicating a negative
value which caused it to return -1 which was passed to the
kernel. Let's make it less opaque by introducing the keyword "auto":
ip netns set <netns-name> auto
will cause nsid to be set to -1 and the kernel will select an available
nsid.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Spartan version of resource tracking documentation.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds ss-similar interface to view various resource
tracked objects. At this stage, only QP is presented.
1. Get all QPs for the specific device:
$ rdma res show qp link mlx5_4
link mlx5_4/- lqpn 8 type UD state RESET sq-psn 0 pid 0 comm [ib_ipoib]
link mlx5_4/1 lqpn 7 type UD state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 1 type GSI state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 0 type SMI state RTS sq-psn 0 pid 0 comm [ib_core]
$ rdma res show qp link mlx5_4/
link mlx5_4/- lqpn 8 type UD state RESET sq-psn 0 pid 0 comm [ib_ipoib]
link mlx5_4/1 lqpn 7 type UD state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 1 type GSI state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 0 type SMI state RTS sq-psn 0 pid 0 comm [ib_core]
2. Provide illegal port number (0 is illegal):
$ rdma res show qp link mlx5_4/0
Wrong device name
3. Get QPs of specific port:
$ rdma res show qp link mlx5_4/1
link mlx5_4/1 lqpn 7 type UD state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 1 type GSI state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 0 type SMI state RTS sq-psn 0 pid 0 comm [ib_core]
4. Get QPs which have not assigned port yet:
link mlx5_4/- lqpn 8 type UD state RESET sq-psn 0 pid 0 comm [ib_ipoib]
5. Limit to specific Local QPNs:
$ rdma res show qp link mlx5_4/1 lqpn 1-3,7
link mlx5_4/1 lqpn 7 type UD state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 1 type GSI state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 0 type SMI state RTS sq-psn 0 pid 0 comm [ib_core]
. Filter types (strings):
$ rdma res show qp link mlx5_4/1 type UD,gSi
link mlx5_4/1 lqpn 7 type UD state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_4/1 lqpn 1 type GSI state RTS sq-psn 0 pid 0 comm [ib_core]
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The global resource summary information. The object names, current utilization
and maximum numbers are received as is from the kernel.
$ rdma res
1: mlx5_0: pd 3 cq 5 qp 4
2: mlx5_1: pd 3 cq 5 qp 4
3: mlx5_2: pd 3 cq 5 qp 4
4: mlx5_3: pd 2 cq 3 qp 2
5: mlx5_4: pd 3 cq 5 qp 4
$ rdma res show mlx5_4
5: mlx5_4: pd 3 cq 5 qp 44
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The dev and link execution callbacks expects that next
command line argument is device or port name.
Set pointer to device or port name position prior calls to
rd_exec_dev()/rd_exec_link().
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds general infrastructure to RDMAtool to handle various
filtering options needed for the downstream resource tracking patches.
The infrastructure is generic and stores filters in list of key<->value
entries. There are three types of filters:
1. Numeric - the values are intended to be digits combined with '-' to
mark range and ',' to mark multiple entries, e.g. pid 1-100,234,400-401
is perfectly legit filter to limit process ids.
2. String - the values are consist from strings and "," as a denominator.
3. Link - special case to allow '/' in string to provide link name, e.g.
link mlx4_1/2.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
According to the IBTA spec [1], the physical connected port is provided
for the QP in RTR-to-INIT stage performed by modify_qp(). It causes
to do not have port number for newly created QPs.
The following patch adds "-" sign to present absence of port, because
QPs are going to be associated with rdmatool link object, which needs
port number as an index.
[1] InfiniBand Architecture Release 1.3 -
"Table 96 QP State Transition Properties"
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Non-JSON tc qdisc output used to print the "requeues" statistic
twice. Commit 4fcec7f366 ("tc: jsonify stats2") tried to preserve
this behaviour for both standard output and JSON, but used the wrong
statistic (q.qlen). Also duplicating keys in JSON is not allowed,
so the second occurrence should be completely skipped with JSON.
Fixes: 4fcec7f366 ("tc: jsonify stats2")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The JSON object name for statistics in ip link show is "stats644".
Looks like a typo, commit d0e720111a ("ip: ipaddress.c: add support
for json output") contains an example with the expected "stats64" name.
The fact that no one has noticed until now is probably an indication
that no one is using this object. Hopefully it's not too late to fix
this, although IIUC this has already been in 4.13 and 4.14 releases :S
Fixes: d0e720111a ("ip: ipaddress.c: add support for json output")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Make JSON output work with prio Qdiscs. This will also make
other qdiscs which reuse the print_qopt work, like mqprio or
pfifo_fast.
Note that there is a double space between "priomap" and first
prio number. Keep this original behaviour.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Make JSON output work with RED Qdiscs. Float/double printing
helpers have to be added/uncommented to print the probability.
Since TC stats in general are not split out to a separate object
the xstats printed by this patch are not separated either.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Serhey Popovych says:
====================
Now we enhance get_addr() to return additional information about address
(e.g. if it unspecified or multicast) we want to have same functionality
for attributes in netlink message.
Introduce and use get_addr_rta() that parses given netlink attribute
into @inet_prefix data structure in the same way similar get_addr()
parses address from it's string representation.
Use attribute length to guess address family: force it by giving non
AF_UNSPEC @family to get_addr_rta() to ensure address is of expected
family.
Introduce and use inet_addr_match_rta() to further simplify and unify
code where get_addr_rta() intended to be used together with
inet_addr_match().
This is next step in ipv4 and ipv6 modules unification to prepare for
merge in the future.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
Introduce and use tnl_print_endpoint() helper to print of tunnel
endpoint address.
Note that for AF_INET and AF_INET6 inet_ntop(3) is used that may return
NULL in case of failure and while unlikely format_host_rta() might
return NULL too. Handle this case when passing local/remote to
print_string().
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there remove & from inet_prefix.data when since it is array.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While there check return from get_prefix() for filter address.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
First is used to get address from netlink attribute to
inet_prefix data structure. Use memcpy() with constant
value to let complier optimize by replacing a call by
inlining load/store instructions.
Second is used to match address in given netlink attribute
with one given as reference. It matches successfully if
no attribute is given (@rta is NULL), reference address
family is AF_UNSPEC or it's length isn't given; fails if
get_attr_rta() can't get attribute or it's family does
not match reference; calls inet_addr_match() to get final
verdict.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Serhey Popovych says:
====================
With this series I want to unify collect metadata
handling in tunnels:
1) Use "external" name for JSON and non-JSON output.
Do not *print* any options when tunnel in
collect metadata mode: gre6 already do
this, so just apply to others.
2) Do not *add* any attributes when configuring
gre tunnel in collect metadata mode.
Other tunnels (e.g. gre6, iptnl, ip6tnl)
alredy do that.
This is next step in ipv4 and ipv6 modules
unification to prepare for merge in the future.
Any comments, suggestions and criticism as always
welcome.
v2
For all tunnels implementing collect metadata
use "external" keyword for both JSON. Thanks
to Jiri Benc for detailed explanation.
====================
Signed-off-by: David Ahern <dsahern@gmail.com>
There are couple of minor improvements:
1) Check erspan_ver == 2 in gre6. It still could
be 1 if erspan_idx is 0.
2) Add tunnel encapsulation attributes only when
collect metadata not in effect in gre.
3) Trivial: address checkpatch issues.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Print only "external" if collect meta data attribute
is given: rest of parameters are irrelevant. This is
to follow gre6.
For both JSON and non-JSON output use "external" for
all tunnels including vxlan and geneve.
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>