Commit Graph

216 Commits

Author SHA1 Message Date
Santosh P K
74e00a55c1 bgpd: BGP assert when it tries to access peer which is closed.
Problem: BGP peer pointer is present in keepalive hash table
even when socket has been closed in some race condition.
When keepalive tries to access this peer it asserts.

RCA: Below sequence of events causing assert.
1. Config node peer has went down due to TCP reset
   it's FD has been set to -1.
2. Doppelganger peer goes to established state and it has
   been added to peer hash table for keepalive when it was
   in openconfirm state.
3. Config node parameters including FD are exchanged with
   doppelganger. Doppelganger will not have FD -1.
4. Doppelganger will be deleted as part of this it will
   remove it from the keepalive peer hash table.
5. While removing from hash table it tries to acquire lock.
6. During this time keepalive thread has the lock and in
   a loop trying to send keepalive for peers in hash table.
7. It tries to send keepalive for doppelganger peer with fd
   set to -1 and asserts.

Signed-off-by: Santosh P K <sapk@vmware.com>
2019-12-09 09:10:57 -08:00
David Lamparter
2b64873d24 *: generously apply const
const const const your boat, merrily down the stream...

Signed-off-by: David Lamparter <equinox@diac24.net>
2019-12-02 15:01:29 +01:00
Donatas Abraitis
c8d6f0d6c4 bgpd: Replace magic number 1 for TTL to BGP_DEFAULT_TTL
For readability and maintainability purposes.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2019-11-27 10:48:17 +02:00
Donatas Abraitis
0e35025eb4 bgpd: Use BGP_NOTIFY_SUBCODE_UNSPECIFIC value for bgp_notify_send() where 0
Just a code cleanup to keep the code consistent.

Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
2019-11-10 17:54:37 +02:00
Lou Berger
ef5307f23f
Merge pull request #4861 from NaveenThanikachalam/logs
BGP: Rectifying the log messages.
2019-09-17 11:33:43 -04:00
Naveen Thanikachalam
4cb5e18ba5 BGP: Rectifying the log messages.
This change addresses the following:
1) Ensures logs under DEBUG macro checks are categorized
   as zlog_debug instead of zlog_info.
2) Error logs are categorized as zlog_err instead of zlog_info.
3) Rephrasing certain logs to make them appear more intuitive.

Signed-off-by: NaveenThanikachalam <nthanikachal@vmware.com>
2019-09-09 22:59:22 -07:00
Quentin Young
1ce14168b3
Merge pull request #4809 from martonksz/master
bgpd: hook for bgp peer status change events
2019-09-09 10:55:00 -04:00
Donald Sharp
11d443f591
Merge pull request #4925 from ddutt/master
bgpd: Fixes to error message printed for failed peerings
2019-09-03 20:36:53 -04:00
Dinesh G Dutt
05912a17e6 bgpd: Fixes to error message printed for failed peerings
There was a silly bug introduced when the command to show failed sessions
was added. A missing "," caused the wrong error message to be printed.
Debugging this led down a path that:
   - Led to discovering one more error message that needed to be added
   - Providing the error code along with the string in the JSON output
     to allow programs to key off numbers rather than strings.
   - Fixing the missing ","
   - Changing the error message to "Waiting for Peer IPv6 LLA" to
     make it clear that we're waiting for the link local addr.

Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
2019-09-03 19:55:49 +00:00
David Lamparter
00dffa8cde lib: add frr_with_mutex() block-wrapper
frr_with_mutex(...) { ... } locks and automatically unlocks the listed
mutex(es) when the block is exited.  This adds a bit of safety against
forgetting the unlock in error paths & co. and makes the code a slight
bit more readable.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2019-09-03 17:15:17 +02:00
Dinesh G Dutt
3577f1c54f bgpd: Add a new command to only show failed peerings
In a data center, having 32-128 peers is not uncommon. In such a situation, to find a
peer that has failed and why is several commands. This hinders both the automatability of
failure detection and the ease/speed with which the reason can be found. To simplify this
process of catching a failure and its cause quicker, this patch does the following:

1. Created a new function, bgp_show_failed_summary to display the
   failed summary output for JSON and vty
2. Created a new function to display the reset code/subcode. This is now used in the
   failed summary code and in the show neighbors code
3. Added a new variable failedPeers in all the JSON outputs, including the vanilla
   "show bgp summary" family. This lists the failed session count.
4. Display peer, dropped count, estd count, uptime and the reason for failure as the
   output of "show bgp summary failed" family of commands
5. Added three resset codes for the case where we're waiting for NHT, waiting for peer
   IPv6 addr, waiting for VRF to init.

This also counts the case where only one peer has advertised an AFI/SAFI.

The new command has the optional keyword "failed" added to the classical summary command.

The changes affect only one existing output, that of "show [ip] bgp neighbors <nbr>". As
we track the lack of NHT resolution for a peer or the lack of knowing a peer IPv6 addr,
the output of that command will show a "waiting for NHT" etc. as the last reset reason.

This patch includes update to the documentation too.

Signed-off-by: Dinesh G Dutt <5016467+ddutt@users.noreply.github.com>
2019-09-02 14:21:44 +00:00
vivek
e2d3a90954 bgpd: Fix nexthop reg for IPv4 route exchange using GUA IPv6 peering
In the case of IPv4 route exchange using GUA IPv6 peering, the route install
into the FIB involves mapping the immediate next hop to an IPv4 link-local
address and installing neighbor entries for this next hop address. To
accomplish the latter, IPv6 Router Advertisements are exchanged (the next hop
or peer must also have this enabled) and the RAs are dynamically initiated
based on next hop resolution.

However, in the case of a passive connection where the local system has not
initiated anything, no NHT entry is created for the peer, hence RAs were not
getting triggered. Address this by ensuring that a NHT entry is created even
in this situation. This is done at the time the connection becomes established
because the code has other assumptions that a NHT entry will be present only
for the "configured" peer. The API to create the entry ensures there are
no duplicates.

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
2019-08-18 22:12:06 -07:00
Marton Kun-Szabo
7d8d0eabb4 bgpd: hook for bgp peer status change events
Generally available hook for plugging application-specific
code in for bgp peer change events.

This hook (peer_status_changed) replaces the previous, more
specific 'peer_established' hook with a more general-purpose one.
Also, 'bgp_dump_state' is now registered under this hook.

Signed-off-by: Marton Kun-Szabo <martonk@amazon.com>
2019-08-13 11:59:27 -07:00
David Lamparter
584470fb5f bgpd: add & use bgp packet dump hook
The MRT dump code is already hooked in at the right places to write out
packets;  the BMP code needs exactly the same access so let's make this
a hook.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2019-07-03 16:58:26 +02:00
Donald Sharp
1cfe005d0c bgpd: Update an fsm debug message
When debugging I was having a hard time correlating some data and noticed that
a particular debug was not being very useful.

Signed-off-by: Donald Sharp <sharpd@cumulusnstworks.com>
2019-05-28 18:10:26 -04:00
Philippe Guibert
b83a6e054c bgpd: do not unregister bfd session when bgp session goes down
This commit fixes a previous commit:
"bfdd: remove operational bfd sessions from remote daemons"
where the handling of unregister call triggers the deletion of bfd
session.
Actually, the BFD session should not be deleted, while bgp session is
configured with BGP. this permits to receive BFD events up, and permit
quicker reconnecion.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2019-05-14 16:50:01 +02:00
Philippe Guibert
fc04a6778e bgpd: improve reconnection mechanism by cancelling connect timers
if bfd comes back up, and a bgp reconnection is in progress, theorically
it should be necessary to wait for the end of the reconnection process.
however, since that reconnection process may take some time, update the
fsm by cancelling the connect timer. This done, one just have to call
the start timer.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2019-04-18 16:11:51 +02:00
Donald Sharp
dded74d578 bgpd: Don't prevent views from being able to connect
Views are perfectly valid and should be allowed to connect.
In a bgp instance scenario the vrf_id will always be UNKNOWN,
so allow it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2019-03-06 11:35:58 -05:00
root
36dc75886d bgpd: Creating Loopback Interface Flaps BGPd (#2865)
* The function bgp_router_id_zebra_bump() will check for active bgp
  peers before chenging the router ID.
  If there are established peers, router ID is not modified
  which prevents the flapping of established peer connection

* Added field in bgp structure to store the count of established peers

Signed-off-by: kssoman <somanks@vmware.com>
2018-11-19 04:35:32 -08:00
Don Slice
5742e42b98 bgpd: make name of default vrf/bgp instance consistent
Problems were reported with the name of the default vrf and the
default bgp instance being different, creating confusion.  This
fix changes both to "default" for consistency.

Ticket: CM-21791
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-7658
Testing: manual testing and automated tests before pushing
2018-10-31 06:20:37 -04:00
David Lamparter
0437e10517 *: spelchek
Signed-off-by: David Lamparter <equinox@diac24.net>
2018-10-25 20:10:57 +02:00
Donald Sharp
19bd3dffc1 bgpd: Do a bit better job of tracking the bgp->peerhash
When we add/remove peers we need to do a bit better job
of tracking them in the bgp->peerhash.

1) When we have the doppelganger take over, make sure the
winner is the one represented in the peerhash.

2) When creating the doppelganger, leave the current one
in place instead of blindly replacing it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-10-07 20:55:52 -04:00
Donald Sharp
9bf904cc8b bgpd: Try to notice when configuration changes during startup
During peer startup there exists the possibility that both
locally and remote peers try to start communication at the
same time.  In addition it is possible for local configuration
to change at the same time this is going on.  When this happens
try to notice that the remote peer may be in opensent or openconfirm
and if so we need to restart the connection from both sides.

Additionally try to write a bit of extra code in peer_xfer_conn
to notice when this happens and to emit a error message to
the end user about this happening so that it can be cleaned up.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-10-01 10:58:06 -04:00
Quentin Young
1c50c1c0d6 *: style for EC replacements
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-09-13 19:38:57 +00:00
Quentin Young
450971aa99 *: LIB_[ERR|WARN] -> EC_LIB
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-09-13 19:34:28 +00:00
Quentin Young
e50f7cfdbd bgpd: BGP_[WARN|ERR] -> EC_BGP
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-09-13 18:51:04 +00:00
Quentin Young
09c866e34d *: rename ferr_zlog -> flog_err_sys
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-08-14 20:02:05 +00:00
Quentin Young
af4c27286d *: rename zlog_fer -> flog_err
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-08-14 20:02:05 +00:00
Donald Sharp
02705213b1 bgpd: Convert to using LIB_ERR_XXX where possible
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-08-14 20:02:05 +00:00
Don Slice
14454c9fdd bgpd: implement zlog_ferr facility for enhance error messages in bgp
Signed-off-by: Don Slice <dslice@cumulusnetworks.com<
2018-08-14 20:02:05 +00:00
Pascal Mathis
b90a8e13ee
bgpd: Implement group-overrides for peer timers
This commit implements BGP peer-group overrides for the timer flags,
which control the value of the hold, keepalive, advertisement-interval
and connect connect timers. It was kept separated on purpose as the
whole timer implementation is quite complex and merging this commit
together with with the other flag implementations did not seem right.

Basically three new peer flags were introduced, namely
*PEER_FLAG_ROUTEADV*, *PEER_FLAG_TIMER* and *PEER_FLAG_TIMER_CONNECT*.
The overrides work exactly the same way as they did before, but
introducing these flags made a few conditionals simpler as they no
longer had to compare internal data structures against eachother.

Last but not least, the test suite has been adjusted accordingly to test
the newly implemented flag overrides.

Signed-off-by: Pascal Mathis <mail@pascalmathis.com>
2018-06-14 18:55:30 +02:00
Donald Sharp
c42eab4bf5 bgpd: Respect ability to reach nexthop if available
When bgp is thinking about opening a connection to a peer,
if we are connected to zebra, allow that to influence our
decision to start the connection.

Found Scenario:

Both bgp and zebra are started up at the same time.  Zebra is
being used to create the connected route through which bgp
will establish a peering relationship.  The machine is a
bit loaded due to other startup conditions and as such bgp
gets to the connection stage here before zebra has installed
the route.  If bgp does not respect zebra data when it does
have a connection then we will attempt to connect.  The
connect will fail because there is no route.  At that time
we will go into the connect timeout(2 minutes) and delay
connection.

What this does.  If we have established a zebra connection and
we do not have a clear path to the destination at this point
do not allow the connection to proceed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-05-11 07:46:43 -04:00
Donald Sharp
54ff5e9b02 bgpd: Cleanup messages from getsockopt
The handling of the return codes for getsockopt was slightly wrong.

getsockopt returns -1 on error and errno is set.
What to do with the return code at that point is dependent
on what sockopt you are asking about.  In this case
status holds the error returned for SO_ERROR.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-05-11 07:34:24 -04:00
G. Paul Ziemba
960035b2d9 bgpd: nexthop tracking with labels for vrf-vpn leaking
Routes that have labels must be sent via a nexthop that also has labels.
This change notes whether any path in a nexthop update from zebra contains
labels. If so, then the nexthop is valid for routes that have labels.

If a nexthop update has no labeled paths, then any labeled routes
referencing the nexthop are marked not valid.

Add a route flag BGP_INFO_ANNC_NH_SELF that means "advertise myself
as nexthop when announcing" so that we can track our notion of the
nexthop without revealing it to peers.

Signed-off-by: G. Paul Ziemba <paulz@labn.net>
2018-04-04 10:00:23 -07:00
Quentin Young
d7c0a89a3a
*: use C99 standard fixed-width integer types
The following types are nonstandard:
- u_char
- u_short
- u_int
- u_long
- u_int8_t
- u_int16_t
- u_int32_t

Replace them with the C99 standard types:
- uint8_t
- unsigned short
- unsigned int
- unsigned long
- uint8_t
- uint16_t
- uint32_t

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-03-27 15:13:34 -04:00
Donald Sharp
5410015a79 bgpd: peer->bgp must be non NULL
We lock and set peer->bgp at peer creation and only
remove it at deletion.  Therefore these tests are
not needed.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2018-03-20 19:09:06 -04:00
Lou Berger
996c93142d *: conform with COMMUNITY.md formatting rules, via 'make indent'
Signed-off-by: Lou Berger <lberger@labn.net>
2018-03-06 14:04:32 -05:00
Philippe Guibert
f62abc7d65 bgpd: do not start BGP VRF peer connection, if VRF not unknown
Upon starting a BGP VRF instance, the server socket is not created,
because the VRF ID is not known, and then underlying VRF backend is not
ready yet. Because of that, the peer connection attempt will not be
started before.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2018-02-27 11:11:24 +01:00
Philippe Guibert
61cf4b3715 bgpd: bgp support for netns
The change contained in this commit does the following:
- discovery of vrf id from zebra daemon, and adaptation of bgp contexts
  with BGP.
  The list of network addresses contain a reference to the bgp context
  supporting the vrf.
  The bgp context contains a vrf pointer that gives information about
  the netns path in case the vrf is a netns path.

Only some contexts are impacted, namely socket creation, and retrieval
of local IP settings. ( this requires vrf identifier).

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2018-02-27 11:11:24 +01:00
Russ White
2ed7e4c3c3
Merge pull request #1591 from qlyoung/bgpd-ringbuf
bgpd: use ring buffer for network input
2018-01-10 19:59:24 -05:00
Quentin Young
0112e9e0b9
bgpd: use atomic_* ops on _Atomic variables
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-01-09 15:40:48 -05:00
Quentin Young
74ffbfe6fe
bgpd: use ring buffer for network input
The multithreading code has a comment that reads:
"XXX: Heavy abuse of stream API. This needs a ring buffer."

This patch makes the relevant code use a ring buffer.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2018-01-03 14:35:11 -05:00
Quentin Young
7a86aa5a0a
bgpd: schedule packet job after connection xfer
During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input and a
reorganized packet processor, the following race condition manifests:

1. Remote peer initiates a connection. After exchanging OPEN messages,
   we send them a KEEPALIVE. They send us a KEEPALIVE followed by
   10,000 UPDATE messages. The I/O thread pushes these onto our local
   peer's input buffer and schedules a packet processing job on the
   main thread.
2. The packet job runs and processes the KEEPALIVE, which completes the
   handshake on our end. As part of transferring to ESTABLISHED we
   transfer all peer state to a new struct, as mentioned. Upon returning
   from the KEEPALIVE processing routing, the peer context we had has
   now been destroyed. We notice this and stop processing. Meanwhile
   10k UPDATE messages are sitting on the input buffer.
3. N seconds later, the remote peer sends us a KEEPALIVE. The I/O thread
   schedules another process job, which finds 10k UPDATEs waiting for
   it. Convergence is achieved, but has been delayed by the value of the
   KEEPALIVE timer.

The racey part is that if the remote peer takes a little bit of time to
send UPDATEs after KEEPALIVEs -- somewhere on the order of a few hundred
milliseconds -- we complete the transfer successfully and the packet
processing job is scheduled on the new peer upon arrival of the UPDATE
messages. Yuck.

The solution is to schedule a packet processing job on the new peer
struct after transferring state.

Lengthy commit message in case someone has to debug similar problems in
the future...

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
7db44ec8fa
bgpd: transfer raw input buffer to new peer
During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input, I forgot
to transfer the raw input buffer to the new peer. This resulted in
infrequent failures during session handshaking whereby half of a packet
would be thrown away in the middle of a read causing us to send a NOTIFY
for an unsynchronized header. Usually the transfer coincided with a
clean input buffer, hence why it only showed up once in a while.
2017-11-30 16:18:05 -05:00
Quentin Young
387f984e58
bgpd: fix bgp active open
At some point when rearranging FSM code, bgpd lost the ability to
perform active opens because it was only paying attention to POLLIN and
not POLLOUT, when the latter is used to signify a successful connection
in the active case.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
becedef6c3
bgpd, tests: comment formatting
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:05 -05:00
Quentin Young
bea0122657
bgpd: misc fsm fixes
* Keepalive on/off calls are necessary in certain cases due to screwy
  fsm flow not turning them on after transferring a passive peer
  connection in peer_xfer_conn

* Missed a case bgp_event_update() that resulted in a return code of -1
  instead of BGP_Stop, which confuses the packet processing routine

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:02 -05:00
Quentin Young
d815168795
bgpd: fix bgp_packet.c / bgp_fsm.c organization
Despaghettification of bgp_packet.c and bgp_fsm.c

Sometimes we call bgp_event_update() inline packet parsing.
Sometimes we post events instead.
Sometimes we increment packet counters in the FSM.
Sometimes we do it in packet routines.
Sometimes we update EOR's in FSM.
Sometimes we do it in packet routines.

Fix the madness.

bgp_process_packet() is now the centralized place to:
- Update message counters
- Execute FSM events in response to incoming packets

FSM events are now executed directly from this function instead of being
queued on the thread_master. This is to ensure that the FSM contains the
proper state after each packet is parsed. Otherwise there could be race
conditions where two packets are parsed in succession without the
appropriate FSM update in between, leading to session closure due to
receiving inappropriate messages for the current FSM state.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:02 -05:00
Quentin Young
a9794991c7
bgpd: bye bye THREAD_BACKGROUND
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:01 -05:00
Quentin Young
9eb217ff69
bgpd: batched i/o
Instead of reading a packet header and the rest of the packet in two
separate i/o cycles, instead read a chunk of data at one time and then
parse as many packets as possible out of the chunk.

Also changes bgp_packet.c to batch process packets.

To avoid thrashing on useless mutex locks, the scheduling call for
bgp_process_packet has been changed to always succeed at the cost of no
longer being cancel-able. In this case this is acceptable; following the
pattern of other event-based callbacks, an additional check in
bgp_process_packet to ignore stray events is sufficient. Before deleting
the peer all events are cleared which provides the requisite ordering.

XXX: chunk hardcoded to 5, should use something similar to wpkt_quanta

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:18:00 -05:00
Quentin Young
b72b6f4fc9
bgpd: rename peer_keepalives* --> bgp_keepalives*
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:59 -05:00
Quentin Young
424ab01d0f
bgpd: implement buffered reads
* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
  piggybacking off of bgp_read()
* Tons of little fixups

Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:59 -05:00
Quentin Young
56257a44e4
bgpd: move bgp i/o to a separate source file
After implement threading, bgp_packet.c was serving the double purpose
of consolidating packet parsing functionality and handling actual I/O
operations. This is somewhat messy and difficult to understand. I've
thus moved all code and data structures for handling threaded packet
writes to bgp_io.[ch].

Although bgp_io.[ch] only handles writes at the moment to keep the noise
on this commit series down, for organization purposes, it's probably
best to move bgp_read() and its trappings into here as well and
restructure that code so that read()'s happen in the pthread and packet
processing happens on the main thread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:59 -05:00
Quentin Young
dc1188bb4d
bgpd: correctly schedule select() at session startup
On TCP connection failure during session setup, bgp_stop() checks
whether peer->t_read is non-null to know whether or not to unschedule
select() on peer->fd before calling close() on it. Using the API exposed
by thread.c instead of bgpd's wrapper macro BGP_READ_ON() results in
this thread value never being set, which causes bgp_stop() to skip the
cancellation of select() before calling close(). Subsequent calls to
select() on that fd crash the daemon.

Use the macro instead.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:58 -05:00
Quentin Young
727c4f870b
bgpd: transfer packets from peer stub to actual peer
During transition from OpenConfirm -> Established, we wipe the peer stub's
output buffer. Because thread.c prioritizes I/O operations over regular
background threads and events, in a single threaded environment this ordering
meant that the output buffer would be happily empty at wipe time.  In MT-land,
this convenient coincidence is no longer true; thus we need to make sure that
any packets remaining on the peer stub get transferred over to the peer proper.

Also removes misleading comment indicating that bgp_establish() sends a
keepalive packet. It does not.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:58 -05:00
Quentin Young
03014d48f4
bgpd: put BGP keepalives in a pthread
This patch, in tandem with moving packet writes into a dedicated kernel
thread, fixes session flaps caused by long-running internal operations
starving the (old) userspace write thread.

BGP keepalives are now produced by a kernel thread and placed onto the
peer's output queue. These are then consumed by the write thread. Both
of these tasks are concurrent with the rest of bgpd, obviating the
session flaps described above.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:57 -05:00
Quentin Young
07a1652682
bgpd: move bgp_connect_check() to bgp_fsm.c
Prior to this change, after initiating a nonblocking connection to the
remote peer bgpd would call both BGP_READ_ON and BGP_WRITE_ON on the
peer's socket. This resulted in a call to select(), so that when some
event (either a connection success or failure) occurred on the socket,
one of bgp_read() or bgp_write() would run. At the beginning of each of
those functions was a hook into bgp_connect_check(), which checked the
socket status and issued the correct connection event onto the BGP FSM.

This code is better suited for bgp_fsm.c. Placing it there avoids
scheduling packet reads or writes when we don't know if the socket has
established a connection yet, and the specific functionality is a better
fit for the responsibility scope of this unit.

This change also helps isolate the responsibilities of the
packet-writing kernel thread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:57 -05:00
Quentin Young
d3ecc69e5f
bgpd: move packet writes into dedicated pthread
* BGP_WRITE_ON() removed
* BGP_WRITE_OFF() removed
* peer_writes_on() added
* peer_writes_off() added
* bgp_write_proceed_actions() removed

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-30 16:17:57 -05:00
Quentin Young
05c7a1cc93
bgpd: use FOREACH_AFI_SAFI where possible
Improves consistency and readability.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-11-21 13:02:06 -05:00
Don Slice
d25e4efc52 bgpd: fix various problems with hold/keepalive timers
Problem reported that we weren't adjusting the keepalive timer
correctly when we negotiated a lower hold time learned from a
peer.  While working on this, found we didn't do inheritance
correctly at all.  This fix solves the first problem and also
ensures that the timers are configured correctly based on this
priority order - peer defined > peer-group defined > global config.
This fix also displays the timers as "configured" regardless of
which of the three locations above is used.

Ticket: CM-18408
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-6807
Testing-performed:  Manual testing successful, fix tested by
submitter, bgp-smoke completed successfully
2017-10-26 11:55:31 -04:00
Renato Westphal
a08ca0a7e1 lib: remove SAFI_RESERVED_4 and SAFI_RESERVED_5
SAFI values have been a major source of confusion over the last few
years. That's because each SAFI needs to be represented in two different
ways:
* IANA's value used to send/receive packets over the network;
* Internal value used for array indexing.

In the second case, defining reserved values makes no sense because we
don't want to index SAFIs that simply don't exist. The sole purpose of
the internal SAFI values is to remove the gaps we have among the IANA
values, which would represent wasted memory in C arrays. With that said,
remove these reserved SAFIs to avoid further confusion in the future.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2017-07-31 23:38:38 -03:00
David Lamparter
9d303b37d7 Revert "*: reindent pt. 2"
This reverts commit c14777c6bf.

clang 5 is not widely available enough for people to indent with.  This
is particularly problematic when rebasing/adjusting branches.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-07-22 14:52:33 +02:00
whitespace / reindent
c14777c6bf
*: reindent pt. 2
w/ clang 5

* reflow comments
* struct members go 1 per line
* binpack algo was adjusted
2017-07-17 15:26:02 -04:00
whitespace / reindent
d62a17aede *: reindent
indent.py `git ls-files | pcregrep '\.[ch]$' | pcregrep -v '^(ldpd|babeld|nhrpd)/'`

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-07-17 14:04:07 +02:00
David Lamparter
acd738fc7f *: fix GCC 7 switch/case fallthrough warnings
Need a comment on these.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-07-14 16:59:43 +02:00
Quentin Young
56b4067930 *: simplify log message lookup
log.c provides functionality for associating a constant (typically a
protocol constant) with a string and finding the string given the
constant. However this is highly delicate code that is extremely prone
to stack overflows and off-by-one's due to requiring the developer to
always remember to update the array size constant and to do so correctly
which, as shown by example, is never a good idea.b

The original goal of this code was to try to implement lookups in O(1)
time without a linear search through the message array. Since this code
is used 99% of the time for debugs, it's worth the 5-6 additional cmp's
worst case if it means we avoid explitable bugs due to oversights...

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-06-21 15:22:21 +00:00
David Lamparter
57463530f3 Merge branch 'stable/3.0'
Conflicts:
	ospf6d/ospf6_lsa.c
	ospfd/ospf_vty.c
	zebra/interface.c

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-05-18 12:28:12 +02:00
David Lamparter
92eedda1fb Merge branch stable/2.0 into stable/3.0
Conflicts:
	bgpd/bgp_fsm.c
	ospf6d/ospf6_lsa.c
	ospfd/ospf_vty.c
	zebra/redistribute.c

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-05-18 12:23:13 +02:00
Donald Sharp
b9796a6e01 bgpd: Fix vrf crash
Ensure that we have a valid vrf before we log
information about it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
2017-05-17 08:48:46 -04:00
Donald Sharp
8c51cac02a bgpd: Fix ADJCHANGE message to include more info
When bgp logs ADJCHANGE messages include the
hostname and vrf that this change is being made
in.

Ticket: CM-10922
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-05-17 08:48:46 -04:00
Donald Sharp
a6e895a9df Merge remote-tracking branch 'origin/stable/2.0' 2017-05-17 08:32:53 -04:00
Don Slice
24de86bc6b Merge branch 'stable/2.0' into bgp-fixes 2017-05-17 07:38:59 -04:00
Don Slice
2e37f307ee bgpd: fix crash in bgp_stop due to missing vrf
Problem found to be derefencing a vrf that had already been deleted.  Fix
verifies that vrf exists before using it.

Ticket: CM-13682
Signed-off-by: Don Slice
Reviewed By: Vivek Venkatraman
Testing Done: manual testing, re-run of failing scripts good
2017-05-16 16:22:38 -04:00
Donald Sharp
d32dfc2201 bgpd: Fix ADJCHANGE message to include more info
When bgp logs ADJCHANGE messages include the
hostname and vrf that this change is being made
in.

Ticket: CM-10922
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-05-16 16:17:10 -04:00
Donald Sharp
c2f6134436 bgpd: Fix vrf crash
Ensure that we have a valid vrf before we log
information about it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
2017-05-16 15:25:53 -04:00
Donald Sharp
c22767d89e bgpd: Fix ADJCHANGE message to include more info
When bgp logs ADJCHANGE messages include the
hostname and vrf that this change is being made
in.

Ticket: CM-10922
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2017-05-16 15:10:33 -04:00
David Lamparter
896014f4bc *: make consistent & update GPLv2 file headers
The FSF's address changed, and we had a mixture of comment styles for
the GPL file header.  (The style with * at the beginning won out with
580 to 141 in existing files.)

Note: I've intentionally left intact other "variations" of the copyright
header, e.g. whether it says "Zebra", "Quagga", "FRR", or nothing.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-05-15 16:37:41 +02:00
Quentin Young
ffa2c8986d *: remove THREAD_ON macros, add nullity check
The way thread.c is written, a caller who wishes to be able to cancel a
thread or avoid scheduling it twice must keep a reference to the thread.
Typically this is done with a long lived pointer whose value is checked
for null in order to know if the thread is currently scheduled.  The
check-and-schedule idiom is so common that several wrapper macros in
thread.h existed solely to provide it.

This patch removes those macros and adds a new parameter to all
thread_add_* functions which is a pointer to the struct thread * to
store the result of a scheduling call. If the value passed is non-null,
the thread will only be scheduled if the value is null. This helps with
consistency.

A Coccinelle spatch has been used to transform code of the form:

  if (t == NULL)
    t = thread_add_* (...)

to the form

  thread_add_* (..., &t)

The THREAD_ON macros have also been transformed to the underlying
thread.c calls.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-05-09 20:44:19 +00:00
Don Slice
cd1964ff38 bgpd: labeled unicast processing
Implement support for negotiating IPv4 or IPv6 labeled-unicast address
family, exchanging prefixes and installing them in the routing table, as
well as interactions with Zebra for FEC registration. This is the
implementation of RFC 3107.

Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
2017-04-06 10:32:07 -04:00
David Lamparter
3012671ffa *: use hooks for sending SNMP traps
This means there are no ties into the SNMP code anymore other than the
init call at startup.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2017-03-25 08:52:36 +01:00
Quentin Young
e9e4c4f8b0 bgpd: remove unnecessary #include "vty.h"
Per previous commit, these are no longer necessary.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
2017-03-02 02:09:00 +00:00
Julien Courtat
4d5b4f7bd9 bgpd: graceful restart for vpnv4 address family
This patch enable the support of graceful restart for routes sets with
vpnv4 address family format. In this specific case, data model is
slightly different and some additional processing must be done when
accessing bgp tables and nodes.
The clearing stale algorithm takes into account the specificity where
the 2 node level for MPLS has to be reached.

Signed-off-by: Julien Courtat <julien.courtat@6wind.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
2017-01-04 18:02:04 +01:00
David Lamparter
53dc2b05c7 Merge branch 'stable/2.0'
Conflicts:
	bgpd/bgp_route.c
	lib/if.c
	ripd/rip_interface.c
	zebra/interface.c
	zebra/zebra_vty.c
2016-12-05 19:48:38 +01:00
Renato Westphal
658bbf6d70 bgpd: optimize copy of strings on peer_xfer_conn()
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
2016-11-28 16:15:27 -02:00
Quentin Young
e52702f29d Merge branch 'cmaster-next' into vtysh-grammar
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>

Conflicts:
	bgpd/bgp_route.c
	bgpd/bgp_routemap.c
	bgpd/bgp_vty.c
	isisd/isis_redist.c
	isisd/isis_routemap.c
	isisd/isis_vty.c
	isisd/isisd.c
	lib/command.c
	lib/distribute.c
	lib/if.c
	lib/keychain.c
	lib/routemap.c
	lib/routemap.h
	ospf6d/ospf6_asbr.c
	ospf6d/ospf6_interface.c
	ospf6d/ospf6_neighbor.c
	ospf6d/ospf6_top.c
	ospf6d/ospf6_zebra.c
	ospf6d/ospf6d.c
	ospfd/ospf_routemap.c
	ospfd/ospf_vty.c
	ripd/rip_routemap.c
	ripngd/ripng_routemap.c
	vtysh/extract.pl.in
	vtysh/vtysh.c
	zebra/interface.c
	zebra/irdp_interface.c
	zebra/rt_netlink.c
	zebra/rtadv.c
	zebra/test_main.c
	zebra/zebra_routemap.c
	zebra/zebra_vty.c
2016-10-17 23:36:21 +00:00
Daniel Walton
1ba2a97af9 bgpd: 'Last write' does not update when we TX a keepalive
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>

Ticket: CM-5518
2016-10-06 13:20:02 +00:00
Daniel Walton
4dcadbefd0 bgpd: argv update for all but bgp_vty.c
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
2016-09-22 15:15:50 +00:00
Donald Sharp
4d41dd8ba2 bgpd: Revert --enable-bgp-standalone
Reverts the --enable-bgp-standalone and makes it so that you
need to use --enable-cumulus to get the cumulus behavior.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2016-09-20 07:57:41 -04:00
David Lamparter
4a1ab8e405 *: split & distribute memtypes and stop (re|ab)using lib/ MTYPEs
This is a rather large mechanical commit that splits up the memory types
defined in lib/memtypes.c and distributes them into *_memory.[ch] files
in the individual daemons.

The zebra change is slightly annoying because there is no nice place to
put the #include "zebra_memory.h" statement.

bgpd, ospf6d, isisd and some tests were reusing MTYPEs defined in the
library for its own use.  This is bad practice and would break when the
memtype are made static.

Acked-by: Vincent JARDIN <vincent.jardin@6wind.com>
Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
[CF: rebased for cmaster-next]
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Signed-off-by: Christian Franke <chris@opensourcerouting.org>
2016-09-19 16:31:04 -04:00
Donald Sharp
b5826a12a2 bgpd: Allow bgp to work standalone
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
2016-09-08 18:48:02 -04:00
Donald Sharp
039f3a3495 lib, bgpd, tests: Refactor FILTER_X in zebra.h
lib/zebra.h has FILTER_X #define's.  These do not belong there.
Put them in lib/filter.h where they belong.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
(cherry picked from commit 0490729cc033a3483fc6b0ed45085ee249cac779)
2016-08-16 11:00:22 -04:00
Paul Jakma
b4575c00ce bgpd: Compile fix for clearing-completion FSM fix, using workqueue helper.
(cherry picked from commit 782fb0770080d0e2970fc63af8645e82543aa4d0)

Conflicts:
	bgpd/bgp_fsm.c
2016-06-06 09:10:39 -07:00
Dinesh G Dutt
e60480bd74 Update last reset reason on interface down or neighbor addr loss.
Ticket:
Reviewed By:
Testing Done:

For interface-based peering, we don't update the reset reason to be
interface down. Similarly, we don't update the reason to be loss of
neighbor address (maybe due to RA loss). This patch addresses these
limitations.
2016-04-25 08:54:44 -07:00
Daniel Walton
f9e9e0736f BGP memory leak in peer hostname
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>

Ticket: CM-9786
2016-03-10 03:58:48 +00:00
vivek
ad4cbda1a3 BGP: VRF registration and cleanup
Various changes and fixes related to VRF registration, deletion,
BGP exit etc.

- Define instance type
- Ensure proper handling upon instance create, delete and
  VRF add/delete from zebra
- Cleanup upon bgp_exit()
- Ensure messages are not sent to zebra for unknown VRFs

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>

Ticket: CM-9128, CM-7203
Reviewed By: CCR-4098
Testing Done: Manual
2016-02-12 13:50:22 -08:00
Daniel Walton
2a3d57318c BGP: route-server will now use addpath...chop the _rsclient code
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>

Ticket: CM-8122

per draft-ietf-idr-ix-bgp-route-server-09:

2.3.2.2.2.  BGP ADD-PATH Approach

   The [I-D.ietf-idr-add-paths] Internet draft proposes a different
   approach to multiple path propagation, by allowing a BGP speaker to
   forward multiple paths for the same prefix on a single BGP session.

   As [RFC4271] specifies that a BGP listener must implement an implicit
   withdraw when it receives an UPDATE message for a prefix which
   already exists in its Adj-RIB-In, this approach requires explicit
   support for the feature both on the route server and on its clients.

   If the ADD-PATH capability is negotiated bidirectionally between the
   route server and a route server client, and the route server client
   propagates multiple paths for the same prefix to the route server,
   then this could potentially cause the propagation of inactive,
   invalid or suboptimal paths to the route server, thereby causing loss
   of reachability to other route server clients.  For this reason, ADD-
   PATH implementations on a route server should enforce send-only mode
   with the route server clients, which would result in negotiating
   receive-only mode from the client to the route server.

This allows us to delete all of the following code:

- All XXXX_rsclient() functions
- peer->rib
- BGP_TABLE_MAIN and BGP_TABLE_RSCLIENT
- RMAP_IMPORT and RMAP_EXPORT
2015-11-10 15:29:12 +00:00
Daniel Walton
40d2700de3 BGP ORF fails to filter prefixes correctly
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>

Ticket: CM-7145
2015-11-04 16:31:33 +00:00
David Lamparter
c7da3d50b3 lib: straighten out ORF prefix list support
BGP ORF prefix lists are in a separate namespace; this was previously
hooked up with a special-purpose AFI value.  This is a little kludgy for
extension, hence this splits it off.

Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
2015-11-03 05:49:39 -08:00
vivek
085567f955 BGP: Do not get out of bgp_start() if peer's IP address isn't known
Ticket: CM-7140
Reviewed By: CCR-3412
Testing Done: bgpsmoke, Atul verified fix

BGP Unnumbered and Interface based peering can interact in some strange
ways. One of them is when there's an IPv4 address on a link on which
BGP Unnumbered session is beng attempted, but the IPv4 address is not
a /30 or /31. As per the bug report, we end up attempting to start the
BGP FSM on receiving a notification that an IPv4 address is present on
an interface. To avoid attempting to go past BGP's start state in the
absence of a valid peer address is the right thing to do. And this
simple patch does just that.

Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by:   Vipin Kumar <vipin@cumulusnetworks.com>
2015-10-20 22:01:49 -07:00
Daniel Walton
bd4b893f77 Remove BGP's asorig timer, it is no longer used
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Reviewed-by:   Donald Sharp <sharpd@cumulusnetworks.com>
2015-10-20 21:54:07 +00:00