Instead of reading a packet header and the rest of the packet in two
separate i/o cycles, instead read a chunk of data at one time and then
parse as many packets as possible out of the chunk.
Also changes bgp_packet.c to batch process packets.
To avoid thrashing on useless mutex locks, the scheduling call for
bgp_process_packet has been changed to always succeed at the cost of no
longer being cancel-able. In this case this is acceptable; following the
pattern of other event-based callbacks, an additional check in
bgp_process_packet to ignore stray events is sufficient. Before deleting
the peer all events are cleared which provides the requisite ordering.
XXX: chunk hardcoded to 5, should use something similar to wpkt_quanta
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
piggybacking off of bgp_read()
* Tons of little fixups
Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
After implement threading, bgp_packet.c was serving the double purpose
of consolidating packet parsing functionality and handling actual I/O
operations. This is somewhat messy and difficult to understand. I've
thus moved all code and data structures for handling threaded packet
writes to bgp_io.[ch].
Although bgp_io.[ch] only handles writes at the moment to keep the noise
on this commit series down, for organization purposes, it's probably
best to move bgp_read() and its trappings into here as well and
restructure that code so that read()'s happen in the pthread and packet
processing happens on the main thread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
On TCP connection failure during session setup, bgp_stop() checks
whether peer->t_read is non-null to know whether or not to unschedule
select() on peer->fd before calling close() on it. Using the API exposed
by thread.c instead of bgpd's wrapper macro BGP_READ_ON() results in
this thread value never being set, which causes bgp_stop() to skip the
cancellation of select() before calling close(). Subsequent calls to
select() on that fd crash the daemon.
Use the macro instead.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
During transition from OpenConfirm -> Established, we wipe the peer stub's
output buffer. Because thread.c prioritizes I/O operations over regular
background threads and events, in a single threaded environment this ordering
meant that the output buffer would be happily empty at wipe time. In MT-land,
this convenient coincidence is no longer true; thus we need to make sure that
any packets remaining on the peer stub get transferred over to the peer proper.
Also removes misleading comment indicating that bgp_establish() sends a
keepalive packet. It does not.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
If write() indicates that we should retry, just move along to the next
peer and come back later. No need to burn write() in a loop.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Changes all synchronization primitives to be dynamically allocated. This
should help catch any subtle errors in pthread lifecycles.
This change also pre-initializes synchronization primitives before
threads begin to run, eliminating a potential race condition that
probably would have caused a segfault on startup on a very fast box.
Also changes mutex and condition variable allocations to use
MTYPE_PTHREAD and updates tests to do the proper initializations.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* Remove t_write
* Remove t_keepalive
These have been replaced by pthreads and are no longer needed. Since
some code looks at these values to determine if the threads are
scheduled, also add a new bitfield to store the same information.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Removes the WiP shim and implements proper thread lifecycle management.
* Declare necessary pthread_t's in bgp_master
* Define new MTYPE in lib/thread.c for pthreads
* Allocate and free BGP's pthreads appropriately
* Terminate and join threads appropriately
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This patch, in tandem with moving packet writes into a dedicated kernel
thread, fixes session flaps caused by long-running internal operations
starving the (old) userspace write thread.
BGP keepalives are now produced by a kernel thread and placed onto the
peer's output queue. These are then consumed by the write thread. Both
of these tasks are concurrent with the rest of bgpd, obviating the
session flaps described above.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Prior to this change, after initiating a nonblocking connection to the
remote peer bgpd would call both BGP_READ_ON and BGP_WRITE_ON on the
peer's socket. This resulted in a call to select(), so that when some
event (either a connection success or failure) occurred on the socket,
one of bgp_read() or bgp_write() would run. At the beginning of each of
those functions was a hook into bgp_connect_check(), which checked the
socket status and issued the correct connection event onto the BGP FSM.
This code is better suited for bgp_fsm.c. Placing it there avoids
scheduling packet reads or writes when we don't know if the socket has
established a connection yet, and the specific functionality is a better
fit for the responsibility scope of this unit.
This change also helps isolate the responsibilities of the
packet-writing kernel thread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Prior to this change, packets generated for update groups were taken off
of the (independent) buffer for the update group, reformatted for the
specific peer under question and sent off inline with bgp_write(). Since
the operations of this code path can include the merging and pruning of
subgroups and are too large to safely synchronize, this change moves
that logic to execute after each tick of the write thread.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This should be allowed:
robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 24
% Invalid prefix range for 1.1.1.0/24, make sure: len < ge-value <= le-value
This commit fixes the issue:
robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 23
% Invalid prefix range for 1.1.1.0/24, make sure: len < ge-value <= le-value
robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 24
robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 25
robot(config)#
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
zebra_find_client needs to match on instance as well so
protocols like ospfd will work correctly for notification.
Modify the zebra_find_client code to accept the instance
number and to pass it in appropriately.
Signed-off-by: Doanld Sharp <sharpd@cumulusnetworks.com>
Add a daemon that will allow us to test the zapi
as well as test route install/removal times from
the kernel.
The current commands are:
install route <starting ip address> nexthop <nexthop> (1-1000000)
This command starts installing at <starting ip address>/32
(1-100000) routes that it auto-increments by 1
Installation start time is noted in the log and finish
time is noted as well.
remove routes <starting ip address> (1-1000000)
This command removes routes at <starting ip address>/32
and removes (1-100000) routes created by the install route
command.
This code can be considered experimental and *is not*
something that should be run in a production environment.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
EIGRP must not advertise routes that have failed to install.
This commit turns on the notification for EIGRP. We still
need to start handling this correctly.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are installing into the kernel, not the
change points for notification to a higher level
protocol and make it happen
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Allow the higher level protocol to specify if it would
like to receive notifications about it's routes that
it has installed.
I've purposely made it part of zclient_new_notify because
we need to track the routes on a per daemon basis only.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Provide ZAPI code that can pass to an upper level protocol
what happened to it's route on install.
There are these notifications:
1) ZAPI_ROUTE_FAIL_INSTALL - The route attempted to be
installed did not work.
2) ZAPI_ROUTE_BETTER_ADMIN_WON - A route that was installed
has become un-installed due to another routing protocol
installing a better admin distance
3) ZAPI_ROUTE_INSTALLED - The route specified has been installed
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
So we have the ability to apply speculative route-maps to
neighbor display to see what the changes would look like
via some show commands. When we do this we make a
shallow copy of the attr data structure and then pass
it around for applying the routemap. After we've applied
this route-map and displayed it we really need to clean
up memory that the route-map application applied.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we are applying an experimental route-map type
to a peer( as part of a show command ) save and
restore the peer's ->rmap_type.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
When we display the advertised-routes and we don't have a route-map
to apply do not re-apply the route-map. This does two things:
1) Fixes a display issue with the show command.
2) More importantly stops leaking memory like a sieve for when
you have a full bgp table.
Fixes: #1345
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Display area x.x.x.x nssa configuration in
running-config. Using nssa translate candiate (default)
case to display 'area x nssa'.
Ticket:CM-18947
Reviewed By:
Testing Done:
Tried various combinations of nssa config,
verified show running-config ospfd output
router ospf
area 2.2.2.2 nssa
area 2.2.2.2 nssa no-summary
router ospf
area 2.2.2.2 nssa translate-always
area 2.2.2.2 nssa no-summary
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Define JSON_C_TO_STRING_NOSLASHESCAPE used for
escaping forward slash.
Disply json output for
'show ip ospf route [vrf all] json'
Ticket:CM-18659
Reviewed By:
Testing Done:
Configure multiple non-default VRF, inject external routes
via redistribute to ospf area.
checked show ip ospf route vrf all /json based output.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
show running-config to display VRF aware ospf instances
even if VRF is not active. This will allow the user to
configured ospf instances configurations even if VRF is not
active. 'show ip ospf vrf all' does not display until VRF
is active.
Ticket:CM-18949
Reviewed By:
Testing Done:
Configure non-default vrf aware ospfs with prior vrf devices
configured.
All vrf aware 'router ospf' displayed in running-configuration.
Disable one of the vrf device still all vrf aware 'router ospf'
displayed in running-config, but 'show ip ospf vrf all' does
not display for which VRF is not active.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>