  1. Oct 21, 2021
  2. Jul 15, 2021
    • bpf: Introduce bpf timers. · b00628b1
      Alexei Starovoitov authored
      
      Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
      in hash/array/lru maps as a regular field and helpers to operate on it:
      
      // Initialize the timer.
      // First 4 bits of 'flags' specify clockid.
      // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
      long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags);
      
      // Configure the timer to call 'callback_fn' static function.
      long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);
      
      // Arm the timer to expire 'nsec' nanoseconds from the current time.
      long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags);
      
      // Cancel the timer and wait for callback_fn to finish if it was running.
      long bpf_timer_cancel(struct bpf_timer *timer);
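
      The clockid encoding described for bpf_timer_init() can be modelled in user
      space as follows. This is only a sketch in Python, not kernel code: the
      function name is hypothetical, and rejecting any bits above the first four
      is an assumption of this model. The clock id values are the Linux ones.

```python
import errno

# Linux clock ids.
CLOCK_REALTIME = 0
CLOCK_MONOTONIC = 1
CLOCK_BOOTTIME = 7

ALLOWED_CLOCKS = {CLOCK_REALTIME, CLOCK_MONOTONIC, CLOCK_BOOTTIME}


def timer_init_clockid(flags):
    """Return the clock id encoded in 'flags', or -EINVAL if it is not allowed.

    Hypothetical helper: models only the flag check, not the timer itself.
    """
    if flags >> 4:
        # Assumption of this model: bits above the first four must be zero.
        return -errno.EINVAL
    clockid = flags & 0xF  # the first 4 bits carry the clock id
    if clockid not in ALLOWED_CLOCKS:
        return -errno.EINVAL
    return clockid
```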
      
      Here is how a BPF program might look:
      struct map_elem {
          int counter;
          struct bpf_timer timer;
      };
      
      struct {
          __uint(type, BPF_MAP_TYPE_HASH);
          __uint(max_entries, 1000);
          __type(key, int);
          __type(value, struct map_elem);
      } hmap SEC(".maps");
      
      static int timer_cb(void *map, int *key, struct map_elem *val);
      /* val points to particular map element that contains bpf_timer. */
      
      SEC("fentry/bpf_fentry_test1")
      int BPF_PROG(test1, int a)
      {
          struct map_elem *val;
          int key = 0;
      
          val = bpf_map_lookup_elem(&hmap, &key);
          if (val) {
              bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME);
              bpf_timer_set_callback(&val->timer, timer_cb);
              bpf_timer_start(&val->timer, 1000 /* call timer_cb in 1 usec */, 0);
          }
          return 0;
      }
      
      This patch adds helper implementations that rely on hrtimers
      to call bpf functions as timers expire.
      The following patches add necessary safety checks.
      
      Only programs with CAP_BPF are allowed to use bpf_timer.
      
      The number of timers a program can use is constrained by
      the memcg recorded at map creation time.
      
      The bpf_timer_init() helper needs an explicit 'map' argument because inner maps
      are dynamic and not known at load time, whereas bpf_timer_set_callback()
      receives a hidden 'aux->prog' argument supplied by the verifier.
      
      The prog pointer is needed to refcount the bpf program so that the program
      doesn't get freed while the timer is armed. This approach relies on the
      "user refcnt" scheme used in prog_array, which stores bpf programs for
      bpf_tail_call. bpf_timer_set_callback() increments the prog refcnt, which is
      paired with bpf_timer_cancel() dropping the prog refcnt.
      ops->map_release_uref is responsible for cancelling the timers and dropping
      the prog refcnt when the user space reference count of a map reaches zero.
      The uref approach makes sure that Ctrl-C of a user space process will
      not leave timers running forever unless user space explicitly pinned a map
      that contained timers in bpffs.
      
      bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if the map
      doesn't have user references (it is neither held by an open file descriptor
      from user space nor pinned in bpffs).
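
      The refcount pairing described above can be illustrated with a toy model.
      This is a user-space sketch in Python that follows only what this commit
      message states; the class and method names are hypothetical, and dropping a
      previously set callback's reference on re-set is an assumption of the model.

```python
import errno


class Prog:
    """Stand-in for a loaded BPF program; refcnt starts at 1 for the loader."""

    def __init__(self):
        self.refcnt = 1


class TimerMap:
    """Toy map whose elements embed a timer."""

    def __init__(self):
        self.urefs = 1   # one user reference: an open fd or a bpffs pin
        self.armed = {}  # key -> Prog whose callback is set on that element

    def timer_set_callback(self, key, prog):
        if self.urefs == 0:
            return -errno.EPERM  # no user references left
        self.timer_cancel(key)   # model assumption: release any previous owner
        prog.refcnt += 1         # keep the program alive while the timer is set
        self.armed[key] = prog
        return 0

    def timer_cancel(self, key):
        prog = self.armed.pop(key, None)
        if prog is not None:
            prog.refcnt -= 1     # paired with the increment above
        return 0

    def release_uref(self):
        """Model of map_release_uref: the last user reference cancels all timers."""
        self.urefs -= 1
        if self.urefs == 0:
            for key in list(self.armed):
                self.timer_cancel(key)
```

      In this model, dropping the last user reference returns every program to
      its original refcount, which mirrors the Ctrl-C scenario above.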
      
      The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
      and free the timer if the given map element had one allocated.
      The "bpftool map update" command can be used to cancel timers.
      
      'struct bpf_timer' is explicitly __attribute__((aligned(8))) because
      '__u64 :64' provides 8 bytes of padding but only 1-byte alignment.
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com
  3. Mar 05, 2021
  4. Dec 04, 2020
  5. Nov 25, 2020
  6. Nov 18, 2020
  7. Oct 29, 2020
  8. Oct 21, 2020
  9. Sep 29, 2020
    • bpf: Add bpf_snprintf_btf helper · c4d0bfb4
      Alan Maguire authored
      
      A helper is added to support tracing kernel type information in BPF
      using the BPF Type Format (BTF).  Its signature is
      
      long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr,
      		      u32 btf_ptr_size, u64 flags);
      
      struct btf_ptr * specifies

      - a pointer to the data to be traced
      - the BTF id of the type of data pointed to
      - a flags field, provided for future use. These flags are not to be
        confused with the BTF_F_* flags below, which control how the btf_ptr
        is displayed. The flags member of struct btf_ptr may be used to
        disambiguate types in kernel versus module BTF, etc. The distinction
        is that these flags relate to the type and the information needed to
        identify it, not to how it is displayed.
      
      For example a BPF program with a struct sk_buff *skb
      could do the following:
      
      	static struct btf_ptr b = { };
      
      	b.ptr = skb;
      	b.type_id = __builtin_btf_type_id(struct sk_buff, 1);
      	bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0);
      
      Default output looks like this:
      
      (struct sk_buff){
       .transport_header = (__u16)65535,
       .mac_header = (__u16)65535,
       .end = (sk_buff_data_t)192,
       .head = (unsigned char *)0x000000007524fd8b,
       .data = (unsigned char *)0x000000007524fd8b,
       .truesize = (unsigned int)768,
       .users = (refcount_t){
        .refs = (atomic_t){
         .counter = (int)1,
        },
       },
      }
      
      Flags modifying display are as follows:
      
      - BTF_F_COMPACT:	no formatting around type information
      - BTF_F_NONAME:		no struct/union member names/types
      - BTF_F_PTR_RAW:	show raw (unobfuscated) pointer values;
      			equivalent to %px.
      - BTF_F_ZERO:		show zero-valued struct/union members;
      			they are not displayed by default
      
      Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
  10. Aug 25, 2020
  11. Jul 18, 2020
    • bpf: Introduce SK_LOOKUP program type with a dedicated attach point · e9ddbb77
      Jakub Sitnicki authored
      
      Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type
      BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer
      when looking up a listening socket for a new connection request for
      connection-oriented protocols, or when looking up an unconnected socket for
      a packet for connectionless protocols.
      
      When called, an SK_LOOKUP BPF program can select a socket that will receive
      the packet. This serves as a mechanism to overcome the limits of what the
      bind() API can express. Two use cases driving this work are:
      
       (1) steer packets destined to an IP range, on a fixed port, to a socket
      
           192.0.2.0/24, port 80 -> NGINX socket
      
       (2) steer packets destined to an IP address, on any port, to a socket
      
           198.51.100.1, any port -> L7 proxy socket
      
      In its run-time context, the program receives information about the packet
      that triggered the socket lookup, namely the IP version, L4 protocol
      identifier, and address 4-tuple. The context can be further extended to
      include the ingress interface identifier.
      
      To select a socket, the BPF program fetches it from a map holding socket
      references, like SOCKMAP or SOCKHASH, and calls the bpf_sk_assign(ctx, sk, ...)
      helper to record the selection. The transport layer then uses the selected
      socket as the result of the socket lookup.
      
      In its basic form, SK_LOOKUP acts as a filter and hence must return either
      SK_PASS or SK_DROP. If the program returns SK_PASS, the transport layer
      should use the socket selected by the program if there is one, and otherwise
      look for a socket to receive the packet, while SK_DROP informs the transport
      layer that the lookup should fail.
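
      The two steering use cases and the verdict semantics can be modelled in user
      space. The sketch below is plain Python, not a BPF program: the rule table,
      the socket labels and the function name are hypothetical, and only the
      SK_DROP/SK_PASS values mirror the kernel's enum sk_action.

```python
import ipaddress

SK_DROP, SK_PASS = 0, 1  # same values as enum sk_action

# (destination network, destination port or None for "any port", socket label)
RULES = [
    (ipaddress.ip_network("192.0.2.0/24"), 80, "nginx"),
    (ipaddress.ip_network("198.51.100.1/32"), None, "l7-proxy"),
]


def sk_lookup(dst_ip, dst_port):
    """Return (verdict, selected socket or None) for a packet's destination."""
    addr = ipaddress.ip_address(dst_ip)
    for net, port, sock in RULES:
        if addr in net and port in (None, dst_port):
            # Corresponds to recording a selection and returning SK_PASS.
            return SK_PASS, sock
    # No selection made: SK_PASS lets the regular socket lookup proceed.
    return SK_PASS, None
```

      A rule that returned SK_DROP instead would make the lookup fail for the
      matching packets; this model never does so.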
      
      This patch only enables the user to attach an SK_LOOKUP program to a
      network namespace. Subsequent patches hook it up to run on the local
      delivery path in the ipv4 and ipv6 stacks.
      
      Suggested-by: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200717103536.397595-3-jakub@cloudflare.com
  12. Jul 01, 2020
  13. Jun 25, 2020
  14. May 11, 2020
    • bpf: Minor fixes to BPF helpers documentation · ab8d7809
      Quentin Monnet authored
      
      Minor improvements to the documentation for BPF helpers:
      
      * Fix formatting for the description of "bpf_socket" for
        bpf_getsockopt() and bpf_setsockopt(), thus suppressing two warnings
        from rst2man about "Unexpected indentation".
      * Fix formatting for return values for bpf_sk_assign() and seq_file
        helpers.
      * Fix and harmonise formatting, in particular for function/struct names.
      * Remove blank lines before "Return:" sections.
      * Replace tabs found in the middle of text lines.
      * Fix typos.
      * Add a note to the footer (in Python script) about "bpftool feature
        probe", including for listing features available to unprivileged
        users, and add a reference to bpftool man page.
      
      Thanks to Florian for reporting two typos (duplicated words).
      
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200511161536.29853-4-quentin@isovalent.com
  15. May 10, 2020
    • bpf: Add bpf_seq_printf and bpf_seq_write helpers · 492e639f
      Yonghong Song authored
      
      Two helpers, bpf_seq_printf and bpf_seq_write, are added for
      writing data to the seq_file buffer.
      
      bpf_seq_printf supports common format string flag/width/type
      fields so at least I can get identical results for
      netlink and ipv6_route targets.
      
      For bpf_seq_printf and bpf_seq_write, the return value -EOVERFLOW
      specifically indicates a write failure due to overflow, which
      means the object will be repeated in the next bpf invocation
      if the object collection stays the same. Note that if the object
      collection has changed, then depending on how the collection is
      traversed, an object may not be visited even if it is still in
      the collection.
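
      The repeat-on-overflow behaviour can be pictured with a small user-space
      model. This is a sketch in Python under simplifying assumptions (a
      fixed-size buffer that is flushed and emptied on overflow, and an unchanged
      collection); the names are hypothetical and it does not reproduce the
      seq_file implementation.

```python
import errno


class SeqBuf:
    """Fixed-size text buffer whose write() fails with -EOVERFLOW when full."""

    def __init__(self, size):
        self.size = size
        self.data = ""

    def write(self, text):
        if len(self.data) + len(text) > self.size:
            return -errno.EOVERFLOW
        self.data += text
        return 0


def dump(objects, buf_size):
    """Write every object once; an object that overflows is retried after a flush."""
    chunks = []
    buf = SeqBuf(buf_size)
    i = 0
    while i < len(objects):
        if buf.write(objects[i]) == -errno.EOVERFLOW:
            if not buf.data:
                raise ValueError("object does not fit in an empty buffer")
            chunks.append(buf.data)  # hand the filled buffer to the reader
            buf = SeqBuf(buf_size)
            continue                 # the same object is visited again
        i += 1
    if buf.data:
        chunks.append(buf.data)
    return chunks
```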
      
      For bpf_seq_printf, the formats %s and %p{i,I}{4,6} need to
      read kernel memory. Reading kernel memory may fail in
      the following two cases:
        - invalid kernel address, or
        - valid kernel address but requiring a major fault
      If reading kernel memory fails, the %s string will be
      an empty string and %p{i,I}{4,6} will be all 0.
      Not returning an error to the bpf program is consistent with
      what bpf_trace_printk() does for now.
      
      bpf_seq_printf may return -EBUSY, meaning that the internal percpu
      buffer for memory copy of strings or other pointees is
      not available. The bpf program can return 1 to indicate it
      wants the same object to be repeated. Right now, this should not
      happen on non-RT kernels since migrate_disable(), which guards
      the bpf prog call, calls preempt_disable().
      
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200509175914.2476661-1-yhs@fb.com
  16. Mar 13, 2020
  17. Feb 26, 2020
  18. Jan 14, 2020
  19. Oct 21, 2019
  20. Oct 16, 2019
  21. Oct 10, 2019
  22. Oct 07, 2019
  23. May 12, 2019
  24. May 17, 2018
    • bpf: change eBPF helper doc parsing script to allow for smaller indent · eeacb716
      Quentin Monnet authored
      
      Documentation for eBPF helpers can be parsed from bpf.h and eventually
      turned into a man page. Commit 6f96674d ("bpf: relax constraints on
      formatting for eBPF helper documentation") changed the script used to
      parse it, in order to allow for different indent style and to ease the
      work for writing documentation for future helpers.
      
      The script currently considers that the first tab can be replaced by 6
      to 8 spaces. But the documentation for bpf_fib_lookup() uses a mix of
      tabs (for the "Description" part) and of spaces ("Return" part), and
      only has a 5-space indent for the latter.
      
      We probably do not want to change the values accepted by the script each
      time a new helper gets a new indent style. However, it is worth noting
      that with those 5 spaces, the "Description" and "Return" parts *look*
      aligned in the generated patch and in `git show`, so it is likely other
      helper authors will use the same length. Therefore, allow for helper
      documentation to use 5 spaces only for the first indent level.
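
      The effect of lowering the minimum indent can be sketched with a pair of
      regexes. The patterns below are illustrative assumptions, not copied from
      the script: they require a single space after the star and then either one
      tab or a bounded run of spaces before the section name.

```python
import re


def make_section_re(min_spaces):
    """Build a hypothetical section-header pattern with a given minimum indent."""
    return re.compile(r' \* (?:\t| {%d,8})(Description|Return)$' % min_spaces)


OLD_RE = make_section_re(6)  # first tab replaceable by 6 to 8 spaces
NEW_RE = make_section_re(5)  # first tab replaceable by 5 to 8 spaces


def section_name(pattern, line):
    """Return 'Description' or 'Return' if 'line' is a section header, else None."""
    m = pattern.match(line)
    return m.group(1) if m else None


TAB_LINE = " * \tDescription"           # tab-indented header
FIVE_SPACE_LINE = " * " + " " * 5 + "Return"  # 5-space indent, as for bpf_fib_lookup()
```

      With these patterns the tab-indented header is accepted by both versions,
      while the 5-space header is accepted only by the relaxed one.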
      
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  25. May 02, 2018
    • bpf: relax constraints on formatting for eBPF helper documentation · 6f96674d
      Quentin Monnet authored
      
      The Python script used to parse and extract eBPF helpers documentation
      from include/uapi/linux/bpf.h expects a very specific formatting for the
      descriptions (single dot represents a space, '>' stands for a tab):
      
          /*
           ...
           *.int bpf_helper(list of arguments)
           *.>    Description
           *.>    >       Start of description
           *.>    >       Another line of description
           *.>    >       And yet another line of description
           *.>    Return
           *.>    >       0 on success, or a negative error in case of failure
           ...
           */
      
      This is too strict, and painful for developers who want to add
      documentation for new helpers. Worse, it is extremely difficult to check
      that the formatting is correct during reviews. Change the format
      expected by the script and make it more flexible. The script now works
      whether or not the initial space (right after the star) is present, and
      accepts both tabs and white spaces (or a combination of both) for
      indenting description sections and contents.
      
      Concretely, something like the following would now be supported:
      
          /*
           ...
           *int bpf_helper(list of arguments)
           *......Description
           *.>    >       Start of description...
           *>     >       Another line of description
           *..............And yet another line of description
           *>     Return
           *.>    ........0 on success, or a negative error in case of failure
           ...
           */
      
      While at it, remove unnecessary carets from each regex used with match()
      in the script. They are redundant, as match() tries to match from the
      beginning of the string by default.
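
      The redundancy of the caret is easy to check with Python's re module. The
      snippet below is a small illustration; the sample line and the function
      name are made up for the purpose.

```python
import re

LINE = " * int bpf_helper(list of arguments)"


def same_match(pattern, text):
    """True if re.match() behaves identically with and without a leading '^'."""
    with_caret = re.match("^" + pattern, text)
    without_caret = re.match(pattern, text)
    if with_caret is None or without_caret is None:
        return with_caret is None and without_caret is None
    return with_caret.span() == without_caret.span()


def found_only_by_search(pattern, text):
    """True if the pattern occurs in 'text' but not at its very beginning."""
    return re.match(pattern, text) is None and re.search(pattern, text) is not None
```

      re.match() only ever succeeds at position 0, whereas re.search() scans the
      whole string, which is why the caret matters for search() but not for match().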
      
      v2: Remove unnecessary caret when a regex is used with match().
      
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  26. Apr 26, 2018
    • bpf: add script and prepare bpf.h for new helpers documentation · 56a092c8
      Quentin Monnet authored
      
      Remove the previous "overview" of eBPF helpers from the user bpf.h header.
      Replace it with a comment explaining how to process the new documentation
      (to come in following patches) with a Python script to produce RST, then
      man page documentation.
      
      Also add the aforementioned Python script under scripts/. It is used to
      process include/uapi/linux/bpf.h and to extract helper descriptions, and
      turns them into an RST document that can further be processed with rst2man
      to produce a man page. The script takes one "--filename <path/to/file>"
      option. If the script is launched from scripts/ in the kernel root
      directory, it should be able to find the location of the header to
      parse, and "--filename <path/to/file>" is then optional. If it cannot
      find the file, then the option becomes mandatory. RST-formatted
      documentation is printed to standard output.
      
      Typical workflow for producing the final man page would be:
      
          $ ./scripts/bpf_helpers_doc.py \
                  --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
          $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
          $ man /tmp/bpf-helpers.7
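
      The extraction step of this workflow can be sketched as a tiny parser. The
      code below is a simplified illustration and not the script itself: the
      regexes, the function names and the sample helper bpf_example() are all
      hypothetical, and the RST it emits is only a rough approximation.

```python
import re

# Hypothetical patterns, much looser than a real parser would need.
PROTO_RE = re.compile(r' \* ?(\w.*\bbpf_\w+\(.*\))$')
SECTION_RE = re.compile(r' \* ?\s+(Description|Return)$')
TEXT_RE = re.compile(r' \* ?\s+(\S.*)$')

SAMPLE = (
    "/*\n"
    " * int bpf_example(void *ctx)\n"
    " * \tDescription\n"
    " * \t\tDo something.\n"
    " * \tReturn\n"
    " * \t\t0 on success.\n"
    " */\n"
)


def extract_helpers(header_text):
    """Collect prototype, Description and Return text for each documented helper."""
    helpers, current, section = [], None, None
    for line in header_text.splitlines():
        m = PROTO_RE.match(line)
        if m:
            current = {"proto": m.group(1), "Description": [], "Return": []}
            helpers.append(current)
            section = None
            continue
        if current is None:
            continue
        m = SECTION_RE.match(line)
        if m:
            section = m.group(1)
            continue
        m = TEXT_RE.match(line)
        if m and section:
            current[section].append(m.group(1))
    return helpers


def to_rst(helpers):
    """Render the collected helpers as a rough RST-like text."""
    out = []
    for h in helpers:
        out.append("**%s**" % h["proto"])
        for sec in ("Description", "Return"):
            out.append("\t%s" % sec)
            out.extend("\t\t%s" % text for text in h[sec])
        out.append("")
    return "\n".join(out)
```

      Calling to_rst(extract_helpers(SAMPLE)) gives text that could be saved and
      passed to rst2man in the same way as the real script's output.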
      
      Note that the tool kernel-doc cannot be used to document eBPF helpers,
      whose signatures are not available directly in the header files
      (pre-processor directives are used to produce them at the beginning of
      the compilation process).
      
      v4:
      - Also remove overviews for newly added bpf_xdp_adjust_tail() and
        bpf_skb_get_xfrm_state().
      - Remove vague statement about what helpers are restricted to GPL
        programs in "LICENSE" section for man page footer.
      - Replace license boilerplate with SPDX tag for Python script.
      
      v3:
      - Change license for man page.
      - Remove "for safety reasons" from man page header text.
      - Change "packets metadata" to "packets" in man page header text.
      - Move and fix comment on helpers introducing no overhead.
      - Remove "NOTES" section from man page footer.
      - Add "LICENSE" section to man page footer.
      - Edit description of file include/uapi/linux/bpf.h in man page footer.
      
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>