	Just a few paragraphs on what it does and how to invoke it. Signed-off-by: David Lamparter <equinox@diac24.net>
		
			
				
	
	
		
.. _xrefs:

Introspection (xrefs)
=====================

The FRR library provides an introspection facility called "xrefs."  The intent
is to provide structured access to annotated entities in the compiled binary,
such as log messages and thread scheduling calls.

Enabling and use
----------------

Support for emitting an xref is included in the macros for the specific
entities, e.g. :c:func:`zlog_info` contains the relevant statements.  The only
requirement for the system to work is a GNU-compatible linker that supports
section start/end symbols.  (The only known linker on any system FRR supports
that does not do this is the Solaris linker.)

To verify that xrefs have been included in a binary or dynamic library, run
``readelf -n binary``.  For individual object files, use
``readelf -S object.o | grep xref_array`` instead.

Structure and contents
----------------------

As a slight improvement to security and fault detection, xrefs are divided into
a ``const struct xref *`` and an optional ``struct xrefdata *``.  The required
const part contains:

.. c:member:: enum xref_type xref.type

   Identifies what kind of object the xref points to.

.. c:member:: int xref.line
.. c:member:: const char *xref.file
.. c:member:: const char *xref.func

   Source code location of the xref.  ``func`` will be ``<global>`` for
   xrefs outside of a function.

.. c:member:: struct xrefdata *xref.xrefdata

   The optional writable part of the xref.  NULL if no non-const part exists.

The optional non-const part has:

.. c:member:: const struct xref *xrefdata.xref

   Pointer back to the constant part.  Since circular pointers are close to
   impossible to emit from inside a function body's static variables, this
   is initialized at startup.

.. c:member:: char xrefdata.uid[16]

   Unique identifier, see below.

.. c:member:: const char *xrefdata.hashstr
.. c:member:: uint32_t xrefdata.hashu32[2]

   Input to unique identifier calculation.  These should encompass all
   details needed to make an xref unique.  If more than one string should
   be considered, use string concatenation for the initializer.

Both structures can be extended by embedding them in a larger type-specific
struct, e.g. ``struct xref_logmsg *``.

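The embedding pattern can be sketched roughly as follows.  This is only an
illustration: the struct layouts are simplified stand-ins built from the
members listed above, and every name starting with ``example_`` or
``EXAMPLE_`` (including the extra ``fmtstring`` and ``priority`` fields) is
hypothetical rather than taken from the FRR headers.

.. code-block:: c

   #include <stddef.h>

   struct xrefdata; /* writable part, not needed for this sketch */

   /* Hypothetical type tag; the real enum values are defined by FRR. */
   enum xref_type { EXAMPLE_XREFT_LOGMSG = 1 };

   /* Simplified stand-in for the constant part documented above. */
   struct xref {
       enum xref_type type;
       int line;
       const char *file;
       const char *func;
       struct xrefdata *xrefdata;
   };

   /* Hypothetical type-specific struct that embeds the generic part and
    * appends its own fields. */
   struct example_xref_logmsg {
       struct xref xref;
       const char *fmtstring;
       int priority;
   };

   /* Recover the enclosing struct from a generic pointer after checking
    * the type tag.  Returns NULL for xrefs of any other type. */
   static const struct example_xref_logmsg *
   example_logmsg_from_xref(const struct xref *xref)
   {
       if (xref->type != EXAMPLE_XREFT_LOGMSG)
           return NULL;
       return (const struct example_xref_logmsg *)
           ((const char *)xref - offsetof(struct example_xref_logmsg, xref));
   }

Code that walks all xrefs only ever sees ``const struct xref *``; checking
``type`` before converting back is what makes the larger struct safe to
access.
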
Unique identifiers
------------------

All xrefs that have a writable ``struct xrefdata *`` part are assigned a
unique identifier, which is formed as the base32 (Crockford) encoding of a
SHA256 hash over:

- the source filename
- the ``hashstr`` field
- the ``hashu32`` fields

.. note::

   Function names and line numbers are intentionally not included to allow
   moving items within a file without affecting the identifier.

For running executables, this hash is calculated once at startup.  When
directly reading from an ELF file with external tooling, the value must be
calculated when necessary.

The identifiers have the form ``AXXXX-XXXXX`` where ``X`` is
``0-9, A-Z except I,L,O,U`` and ``A`` is ``G-Z except I,L,O,U`` (i.e. the
identifiers always start with a letter).  When reading identifiers from user
input, ``I`` and ``L`` should be replaced with ``1`` and ``O`` should be
replaced with ``0``.  There are 49 bits of entropy in this identifier: the
leading symbol has 16 possible values (4 bits) and each of the remaining nine
symbols has 32 (5 bits).

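As an illustration of the input handling described above, the following is a
minimal sketch of normalizing and validating an identifier typed by a user.
The function names are hypothetical and not part of FRR, and accepting
lowercase input is an assumption of this sketch rather than documented
behaviour.

.. code-block:: c

   #include <ctype.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <string.h>

   /* Crockford base32 alphabet: digits, then letters without I, L, O, U. */
   static const char example_alphabet[] = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

   /* Normalize one input character; returns 0 if it cannot be part of an
    * identifier.  Folding lowercase to uppercase is an assumption. */
   static char example_uid_normalize_char(char c)
   {
       c = (char)toupper((unsigned char)c);
       if (c == 'I' || c == 'L')
           return '1';
       if (c == 'O')
           return '0';
       if (c != '\0' && strchr(example_alphabet, c))
           return c;
       return 0;
   }

   /* Normalize "AXXXX-XXXXX" from user input into out (11 characters plus
    * NUL).  Returns false if the input does not have the documented form. */
   static bool example_uid_normalize(const char *in, char out[12])
   {
       if (strlen(in) != 11 || in[5] != '-')
           return false;

       for (size_t i = 0; i < 11; i++) {
           if (i == 5) {
               out[i] = '-';
               continue;
           }
           out[i] = example_uid_normalize_char(in[i]);
           if (!out[i])
               return false;
       }
       out[11] = '\0';

       /* The first symbol must come from G-Z, i.e. the upper half of the
        * alphabet (index 16..31). */
       return strchr(example_alphabet, out[0]) - example_alphabet >= 16;
   }

With this sketch, the input ``ZOIL0-12345`` normalizes to ``Z0110-12345``,
while ``0HJKM-NPQRS`` is rejected because it does not start with a letter
from the allowed range.
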
Underlying machinery
--------------------

Xrefs are nothing other than global variables with some extra glue to make
them possible to find from the outside by looking at the binary.  The first
non-obvious part is that they can occur inside of functions, since they're
defined as ``static``.  They don't have a visible name -- they don't need one.

To make finding these variables possible, another global variable, a pointer
to the first one, is created in the same way.  However, it is put in a special
ELF section through ``__attribute__((section("xref_array")))``.  This is the
section you can see with readelf.

Finally, on the level of a whole executable or library, the linker will place
the individual pointers consecutively, since they're in the same section,
which is what turns them into an array.  The start and end of this array are
given by the linker-autogenerated ``__start_xref_array`` and
``__stop_xref_array`` symbols.  Using these, both a constructor to run at
startup and an ELF note are created.

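The mechanism can be sketched in a few lines of C.  This is a simplified
illustration, not the actual FRR macros: ``struct example_xref``,
``EXAMPLE_XREF`` and the functions below are hypothetical names, and the
sketch assumes GCC or clang together with a GNU-compatible linker on an ELF
target.

.. code-block:: c

   #include <stddef.h>

   /* Hypothetical, minimal stand-in for the constant part of an xref. */
   struct example_xref {
       const char *file;
       int line;
   };

   /* Emit one static entry plus a pointer to it in the "xref_array"
    * section.  The "used" attribute keeps the otherwise unreferenced
    * pointer from being discarded by the compiler. */
   #define EXAMPLE_XREF()                                                 \
       do {                                                               \
           static const struct example_xref ex_ = {__FILE__, __LINE__};   \
           static const struct example_xref *const ex_p_                  \
               __attribute__((used, section("xref_array"))) = &ex_;       \
       } while (0)

   /* Generated by the linker for sections whose name is a valid C
    * identifier. */
   extern const struct example_xref *const __start_xref_array[];
   extern const struct example_xref *const __stop_xref_array[];

   /* Two annotated functions; each contributes one array entry. */
   void example_annotated_a(void) { EXAMPLE_XREF(); }
   void example_annotated_b(void) { EXAMPLE_XREF(); }

   /* Count the entries by walking from the start to the stop symbol. */
   static size_t example_xref_count(void)
   {
       size_t n = 0;

       for (const struct example_xref *const *p = __start_xref_array;
            p < __stop_xref_array; p++)
           n++;
       return n;
   }

The entries exist as soon as the program is linked; the annotated functions
do not need to be called.  In a program consisting only of this file,
``example_xref_count()`` would return 2.
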
The ELF note is the entry point for externally retrieving xrefs from a binary
without having to run it.  It can be found by walking through the ELF data
structures even if the binary has been fully stripped of debug and section
information.  SystemTap's SDT probes and LTTng's trace points work in the same
way (though they emit one note for each probe, while xrefs only emit one note
in total, which refers to the array).  Using xrefs does not impact SystemTap
or LTTng, since the notes carry identifiers by which they can be
distinguished.

The ELF structure of a linked binary (library or executable) will look like
this::

  $ readelf --wide -l -n lib/.libs/libfrr.so

  Elf file type is DYN (Shared object file)
  Entry point 0x67d21
  There are 12 program headers, starting at offset 64

  Program Headers:
    Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
    PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0002a0 0x0002a0 R   0x8
    INTERP         0x125560 0x0000000000125560 0x0000000000125560 0x00001c 0x00001c R   0x10
        [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x02aff0 0x02aff0 R   0x1000
    LOAD           0x02b000 0x000000000002b000 0x000000000002b000 0x0b2889 0x0b2889 R E 0x1000
    LOAD           0x0de000 0x00000000000de000 0x00000000000de000 0x070048 0x070048 R   0x1000
    LOAD           0x14e428 0x000000000014f428 0x000000000014f428 0x00fb70 0x01a2b8 RW  0x1000
    DYNAMIC        0x157a40 0x0000000000158a40 0x0000000000158a40 0x000270 0x000270 RW  0x8
    NOTE           0x0002e0 0x00000000000002e0 0x00000000000002e0 0x00004c 0x00004c R   0x4
    TLS            0x14e428 0x000000000014f428 0x000000000014f428 0x000000 0x000008 R   0x8
    GNU_EH_FRAME   0x12557c 0x000000000012557c 0x000000000012557c 0x00819c 0x00819c R   0x4
    GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
    GNU_RELRO      0x14e428 0x000000000014f428 0x000000000014f428 0x009bd8 0x009bd8 R   0x1

  (...)

  Displaying notes found in: .note.gnu.build-id
    Owner                Data size 	Description
    GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)	    Build ID: 6a1f66be38b523095ebd6ec13cc15820cede903d

  Displaying notes found in: .note.FRR
    Owner                Data size 	Description
    FRRouting            0x00000010	Unknown note type: (0x46455258)	   description data: 6c eb 15 00 00 00 00 00 74 ec 15 00 00 00 00 00

Here, 0x15eb6c…0x15ec74 are the offsets (relative to the note itself) where
the xref array is located in the file.  Also note that the owner is clearly
marked as "FRRouting" and that the type is "XREF" in hex.

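To make the mapping from the ``description data`` bytes to these two values
explicit, here is a small sketch that decodes the payload shown in the
readelf output above.  The helper names are hypothetical, and the sketch
assumes the little-endian, 64-bit layout of this particular example binary.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Read a little-endian 64-bit value byte by byte, so the result does
    * not depend on the byte order of the host doing the analysis. */
   static uint64_t example_read_le64(const uint8_t *p)
   {
       uint64_t v = 0;

       for (size_t i = 0; i < 8; i++)
           v |= (uint64_t)p[i] << (8 * i);
       return v;
   }

   /* Rebuild the 4-character tag from the numeric note type, lowest byte
    * first, which is the order the bytes have in a little-endian file. */
   static void example_note_type_tag(uint32_t type, char tag[5])
   {
       for (size_t i = 0; i < 4; i++)
           tag[i] = (char)((type >> (8 * i)) & 0xff);
       tag[4] = '\0';
   }

   /* The 16 description bytes from the readelf output above. */
   static const uint8_t example_desc[16] = {
       0x6c, 0xeb, 0x15, 0x00, 0x00, 0x00, 0x00, 0x00,
       0x74, 0xec, 0x15, 0x00, 0x00, 0x00, 0x00, 0x00,
   };

Calling ``example_read_le64()`` on the first and second half of
``example_desc`` yields 0x15eb6c and 0x15ec74, and
``example_note_type_tag(0x46455258, tag)`` produces the string ``XREF``.
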
For SystemTap's use of ELF notes, refer to
https://libstapsdt.readthedocs.io/en/latest/how-it-works/internals.html as an
entry point.

.. note::

   Due to GCC bug 41091, the "xref_array" section is not correctly generated
   for C++ code when compiled by GCC.  A workaround is present for runtime
   functionality, but to extract the xrefs from a C++ source file, it needs
   to be built with clang (or a future fixed version of GCC) instead.

Extraction tool
---------------

The FRR source contains a matching tool to extract xref data from compiled ELF
binaries in ``python/xrelfo.py``.  This tool uses CPython extensions
implemented in ``clippy`` and must therefore be executed with that.

``xrelfo.py`` processes input from one or more ELF files (.o, .so, executable),
libtool objects (.lo, .la, executable wrapper script) or JSON files (output
from ``xrelfo.py``) and generates an output JSON file.  During a standard FRR
build, it is invoked on all binaries and libraries and the result is combined
into ``frr.json``.

ELF files from any operating system, CPU architecture and endianness can be
processed on any host.  Any issues with this are bugs in ``xrelfo.py``
(or clippy's ELF code).

``xrelfo.py`` also performs some sanity checking, particularly on log
messages.  The following options are available:

.. option:: -o OUTPUT

   Filename to write JSON output to.  As a convention, a ``.xref`` filename
   extension is used.

.. option:: -Wlog-format

   Performs extra checks on log message format strings, particularly checks
   for ``\t`` and ``\n`` characters (which should not be used in log messages).

.. option:: -Wlog-args

   Generates cleanup hints for format string arguments where
   :c:func:`printfrr()` extensions could be used, e.g. replacing ``inet_ntoa``
   with ``%pI4``.

.. option:: --profile

   Runs the Python profiler to identify hotspots in the ``xrelfo.py`` code.

``xrelfo.py`` uses information about C structure definitions saved in
``python/xrefstructs.json``.  This file is included with the FRR sources and
only needs to be regenerated when some of the ``struct xref_*`` definitions
are changed (which should be almost never).  The file is written by
``python/tiabwarfo.py``, which uses ``pahole`` to extract the necessary data
from DWARF information.