514 Commits

Author SHA1 Message Date
Aaro Altonen 8ebadad579 Move definition of socket_t to util.hh
It's used in multiple places, so a dedicated place for its definition
is necessary
2020-01-17 13:41:52 +02:00
Aaro Altonen 4370119140 Document frame.hh better 2020-01-17 10:33:28 +02:00
Aaro Altonen aeb3382421 Remove unused code from frame.cc
The media header pointer getters are not used anymore
2020-01-17 10:28:03 +02:00
Aaro Altonen 8fae92b4e1 Rename frame types more logically 2020-01-17 10:18:11 +02:00
Aaro Altonen 84715a77a6 Remove warnings 2020-01-16 10:10:26 +02:00
Aaro Altonen eeae6981e3 Update README
Remove all unnecessary information about kvzRTP's internal workings
and leave only the compiling/linking/usage information
2020-01-16 10:10:19 +02:00
Aaro Altonen 751ab6983d Merge branch 'dynamic-config' into develop 2020-01-13 10:07:26 +02:00
Aaro Altonen 84441b39c4 Add examples of new dynamic configuration 2020-01-13 10:06:51 +02:00
Aaro Altonen 7cf1c4d36d Make dynamic configuration connection-based
Making configuration global was moronic considering there are
different types of media streams per session (e.g. Opus and HEVC)
which have very different needs. For example, setting the
receiver's UDP buffer size to 40 MB would make no sense for Opus.

Now each connection can be configured individually which is also
a needed feature for SRTP

This change reverted the changes made earlier to global API
2020-01-13 10:02:34 +02:00
Aaro Altonen d0f18d4864 Prepare kvzRTP for SRTP/SRTCP support
The security layer is injected between reading a datagram from OS and
RTP/RTCP payload processing so the obvious place for that layer is socket.

Make all recv/send function calls go through socket API so the security
layer function calls don't have to be copied everywhere
2020-01-10 10:30:47 +02:00
Aaro Altonen 730bd3ab78 Update README 2020-01-09 10:24:57 +02:00
Aaro Altonen 66f6acd0eb Add Windows support for the latest changes 2020-01-09 10:10:58 +02:00
Aaro Altonen 4b5efed3fc Make read datagram count dynamic based on the state of frame
To prevent excess relocations while minimizing the number of system
calls, OFR reads 15 datagrams from the OS using one system call if
more than 2% and less than 98% of the frame has been read.

These values are a result of experimentation and they lowered the CPU
utilization the most. Compared to the simple receiver, OFR with dynamic
datagram read size uses 14% less CPU.

These numbers could be improved even further if media-specific
optimizations were made, such as keeping track of the intra or VPS
period to adjust the max datagram read size or the legal interval for
max read.

These optimizations are, however, probably not going to yield a lot
of benefit compared to the current state of OFR and are thus not
implemented. As it is, OFR is already able to receive HEVC at
580 MB/s and uses 14% less CPU than the simple receiver, so for
high-quality video conferencing situations with multiple participants
this is a good choice.
2020-01-09 09:30:22 +02:00
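The read-size policy described in the commit above could be sketched roughly as follows. This is an illustration only, not kvzRTP's actual code: the function name, the constant name, and the byte-based progress measure are assumptions. Only the batch size of 15 and the 2%/98% thresholds come from the commit message, and the sketch reads them as a window (batch reads apply only between 2% and 98% of the frame).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many datagrams to request with one system call.
// Only the batch size (15) and the 2%/98% window come from the commit message.
constexpr std::size_t MAX_BATCH = 15;

std::size_t datagrams_to_read(std::size_t bytes_received, std::size_t frame_size)
{
    assert(bytes_received <= frame_size);

    if (frame_size == 0)
        return 1; // frame size not known yet: read conservatively

    // Integer arithmetic avoids floating-point rounding at the thresholds.
    const bool past_start = bytes_received * 100 > frame_size * 2;
    const bool before_end = bytes_received * 100 < frame_size * 98;

    return (past_start && before_end) ? MAX_BATCH : 1;
}
```

Reading one datagram at a time near the start and end of a frame keeps the receiver from over-reading into the next frame, while the batch read in the middle is what saves system calls.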
Aaro Altonen 3e981ec4ae Switch from compile-time to runtime configuration of kvzRTP 2020-01-08 10:56:09 +02:00
Aaro Altonen 9472c4142c Fix example code
Some of the code used the old API, now fixed
2020-01-07 10:20:20 +02:00
Aaro Altonen 729db0c928 Update kvzRTP
Miscellaneous changes to various files
2020-01-07 09:54:33 +02:00
Aaro Altonen e5bdd39ce4 add latency testing programs for kvzrtp and ffmpeg 2019-12-09 18:25:59 +02:00
Aaro Altonen 3e127885e0 improve ffmpeg and kvzrtp sending 2019-12-09 18:25:18 +02:00
Aaro Altonen ab9ae93124 Fix the normal way of sending for HEVC
The version that does not use sendmmsg(2) didn't return proper status
codes for __push_hevc_frame() when it had sent the smaller NAL units.

This caused it to send far less data than it should have
2019-11-27 08:39:56 +02:00
Aaro Altonen ce7aebf97a Add Windows support 2019-11-13 08:08:52 +02:00
Aaro Altonen 3053f1b4bf Give memory advice for Linux about the HEVC file
This removes yet a little more latency from the setup to get a better
theoretical maximum. Both kvzRTP and FFmpeg benefit from this.
2019-11-06 08:58:42 +02:00
Aaro Altonen b9b0cdc4b7 Rewrite FFmpeg send benchmark code 2019-11-01 08:47:24 +02:00
Aaro Altonen 6210c37dbc Use C++'s high-resolution clock to calculate RTP timestamps
The NTP millisecond diff calculation seems to be incorrect (it gives
very weird results) but miraculously it still produced a playable video
stream.

I'll need to figure out what's wrong with the calculation at some point
but for now switch to using HRC.
2019-11-01 08:41:41 +02:00
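A minimal sketch of deriving an RTP timestamp from C++'s high-resolution clock, as the commit above describes in general terms. The function name and parameters are hypothetical and not taken from kvzRTP. The 90 kHz default is the clock rate the HEVC RTP payload format requires; other media would pass their own rate.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: convert elapsed wall time into RTP clock ticks.
std::uint32_t rtp_timestamp(std::uint32_t base_ts,
                            std::chrono::high_resolution_clock::time_point start,
                            std::chrono::high_resolution_clock::time_point now,
                            std::uint32_t clock_rate = 90000)
{
    using namespace std::chrono;

    const auto elapsed_us = duration_cast<microseconds>(now - start).count();
    if (elapsed_us < 0)
        return base_ts; // clock went backwards: keep the base value

    const std::uint64_t ticks =
        static_cast<std::uint64_t>(elapsed_us) * clock_rate / 1000000;

    // RTP timestamps are 32 bits wide, so the result wraps modulo 2^32.
    return static_cast<std::uint32_t>(base_ts + ticks);
}
```

With a base timestamp of 1000 and exactly one second elapsed, this yields 91000 at the 90 kHz rate.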
Aaro Altonen 4ce0eecec9 Rewrite kvzRTP sender/receiver benchmarking code
Instead of both files containing both receiver and sender,
separate them into their own files
2019-11-01 08:37:51 +02:00
Aaro Altonen c7e37b9b06 Add FFmpeg receiver code 2019-10-31 11:15:02 +02:00
Aaro Altonen da88aeadfd Fix dispatcher
A compiler flag had disabled it, masking the compile errors; now fixed
2019-10-30 11:40:54 +02:00
Aaro Altonen ad3d700f6a Split send_vecio() into writes of 1024 messages
Linux seems to have an undocumented "feature" where it accepts only
1024 messages per sendmmsg(2).

So basically, if you gave it a buffer containing, for example, 1100
messages, it would only send the first 1024 **without returning an error**.

This caused large intra frames not to be received fully, creating
a broken stream
2019-10-30 10:57:52 +02:00
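The splitting described in the commit above can be sketched as a loop that never hands more than 1024 messages to a single call. The names below are hypothetical and the actual sendmmsg(2) call is replaced by a callback so the sketch stays portable; it is not the real send_vecio(). For what it's worth, the sendmmsg(2) man page notes that vlen is capped to UIO_MAXIOV (1024).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Per-call cap taken from the commit message.
constexpr std::size_t MAX_MSGS_PER_CALL = 1024;

// Hypothetical stand-in for one sendmmsg(2) call: sends `count` messages
// starting at index `first`, returns how many were sent or -1 on error.
using batch_sender = std::function<long(std::size_t first, std::size_t count)>;

// Returns the total number of messages sent, or -1 on error.
long send_all(std::size_t total, const batch_sender& send_batch)
{
    assert(send_batch && "a sender callback is required");

    std::size_t sent = 0;
    while (sent < total) {
        const std::size_t count = std::min(MAX_MSGS_PER_CALL, total - sent);
        const long ret = send_batch(sent, count);
        if (ret <= 0)
            return -1;

        // Advance by what was actually sent, so a short send is retried
        // from the right position on the next round.
        sent += static_cast<std::size_t>(ret);
    }
    return static_cast<long>(sent);
}
```

For 1100 messages this issues two calls, one with 1024 messages and one with the remaining 76.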
Aaro Altonen 75367367ab Fix warnings 2019-10-29 08:34:44 +02:00
Aaro Altonen d95920908d Readjust offset pointer after the fragments have been processed
When MSG_WAITFORONE is used, the system call returns 1..N packets
but the code initially assumes N packets are read so the offset pointer
might need adjustment after the fragments have been processed.
2019-10-29 08:14:19 +02:00
Aaro Altonen dd26eb2fe3 Fix optimistic HEVC receiver
There are still some weird corner cases that are not resolved
but they're very rare. I have to fix them at some point
2019-10-28 10:10:49 +02:00
Aaro Altonen 2e82d0e0b0 Remove Runner thread related double free
Both reader and runner released the runner_ object resulting
in a double free
2019-10-25 10:03:11 +03:00
Aaro Altonen 6810c27cb5 Keep track of late frames that were discarded
These frames were not received within a certain time window and are
no longer of any use to the application
2019-10-25 09:30:27 +03:00
Aaro Altonen 87c68366aa Make sure the sequence number overflow is considered
The sequence number counter is only 16 bits long, meaning that it will
overflow quite fast and can cause the S fragment to have a larger
sequence number than the E fragment.

The previous calculation didn't take this into account, which caused all
fragments after the first overflow to be discarded
2019-10-25 09:12:24 +03:00
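One common way to make such a calculation safe against 16-bit wraparound is to do the subtraction modulo 65536. The sketch below shows that technique under assumed names; it is not necessarily how kvzRTP fixed the problem.

```cpp
#include <cstdint>

// Number of packets from the S (start) fragment to the E (end) fragment,
// inclusive. Truncating to 16 bits makes the subtraction wrap modulo
// 65536, so the result stays correct across a sequence number overflow.
// (A span of exactly 65536 packets would wrap to 0.)
std::uint16_t fragment_count(std::uint16_t s_seq, std::uint16_t e_seq)
{
    return static_cast<std::uint16_t>(e_seq - s_seq + 1);
}

// True if `seq` lies within [s_seq, e_seq], also when the range wraps.
bool seq_in_fragment_range(std::uint16_t seq, std::uint16_t s_seq, std::uint16_t e_seq)
{
    const std::uint16_t offset = static_cast<std::uint16_t>(seq - s_seq);
    const std::uint16_t span   = static_cast<std::uint16_t>(e_seq - s_seq);
    return offset <= span;
}
```

For example, fragment_count(65534, 2) yields 5, covering the sequence numbers 65534, 65535, 0, 1 and 2, even though the S fragment's number is larger than the E fragment's.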
Aaro Altonen 9ec560613f Remove the transaction from queue only after it's deinitialized 2019-10-25 09:07:55 +03:00
Aaro Altonen 0f0a052e54 Create Runner class
Several classes have common active_ and runner_ variables and
stop/start/active routines (such as reader and dispatch).

Create one common class for these to make the interface cleaner
2019-10-24 08:36:54 +03:00
Aaro Altonen bde19e3f7f Create RTCP function for getting session participants' SSRCs 2019-10-24 08:15:46 +03:00
Aaro Altonen c0e1f69081 Separate hooked and polling RTP frame receiving examples 2019-10-24 08:15:20 +03:00
Aaro Altonen 8f2ae2063c Create RTCP hooking and polling examples 2019-10-24 08:05:55 +03:00
Aaro Altonen 1c33e6a0a2 Create RTCP getter
Used to install hooking functions for RTCP packets
2019-10-24 07:49:17 +03:00
Aaro Altonen 931fb74521 Deinitialize transactions that don't get committed to SCD's queue 2019-10-22 12:27:43 +03:00
Aaro Altonen 4385361fdc Create examples for memory deallocation/ownership stuff 2019-10-22 07:22:02 +03:00
Aaro Altonen bcf7c82e48 Add support for deallocation hook
This is the third way of dealing with memory ownership/deallocation.

Application "lends" the memory for kvzRTP's use and when SCD has finished
processing the transaction associated with this memory, it will call the
deallocation hook provided by the application to release the memory.

This makes, for example, custom allocators possible where wrapping the
memory inside a unique_ptr is not suitable and creating copies is also
not acceptable.
2019-10-11 11:23:38 +03:00
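The "lend the memory, get called back when it is safe to free" idea from the commit above might look roughly like this. All type and function names here are hypothetical and heavily simplified; kvzRTP's real API may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical hook type: the application decides how memory is released.
using dealloc_hook = std::function<void(std::uint8_t *)>;

// Hypothetical, simplified transaction: borrowed memory plus its hook.
struct transaction {
    std::uint8_t *data;
    std::size_t   len;
    dealloc_hook  release;
};

// Called by the backend once the transaction's data has been sent.
void finish_transaction(transaction& t)
{
    if (t.data && t.release)
        t.release(t.data); // hand the memory back to the application

    // Clear the pointer so the same buffer is never released twice.
    t.data = nullptr;
    t.len  = 0;
}
```

Because the library only ever calls the hook, the application is free to back the buffer with a pool or custom allocator instead of new/delete.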
Aaro Altonen 1695691d2b Add support for RTP_COPY flag
This was actually a very simple thing to do because we can take
advantage of the unique_ptr support added earlier
2019-10-11 10:26:35 +03:00
Aaro Altonen 434a07156c Add unique_ptr support for RTP writer API and all media formats 2019-10-11 09:16:39 +03:00
Aaro Altonen 45d2a4ec26 Add unique_ptr support for frame queue
Now both unique_ptrs and raw pointers can be used
2019-10-11 09:06:35 +03:00
Aaro Altonen eec4ceb41a Add specification on how to deal with memory ownership/deallocation problem 2019-10-10 08:41:09 +03:00
Aaro Altonen e6adf2ad21 Update README 2019-10-10 07:54:35 +03:00
Aaro Altonen d1a6d18adf Create System Call Dispatcher
This is a separate thread running in the background responsible for
executing system calls (mainly sending UDP packets).

This commit divides the sending into frontend and backend:
  - Frontend packetizes the media into "transactions" which are then
    pushed to backend's task queue
  - Backend executes these transactions FIFO-style and pushes executed
    transactions back to frame queue for reuse

Frontend is the part of sending that executes in application's context
and backend (system call) happens in a background thread.

Ideally frontend and backend would be run on separate physical cores.

This change made sending significantly faster (from 650 MB/s to 720 MB/s)
and cut down the delay experienced by application from 315us to 45us
for large HEVC chunks (177 kB)
2019-10-09 10:38:41 +03:00
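The frontend/backend split described in the commit above is a classic producer/consumer arrangement. The following is a minimal sketch of that pattern, not the actual System Call Dispatcher: the class name is made up, a "transaction" is reduced to a callable, and the recycling of executed transactions back to the frame queue is left out.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical, simplified dispatcher.
class dispatcher {
public:
    using transaction = std::function<void()>;

    dispatcher() : worker_([this] { run(); }) {}

    ~dispatcher()
    {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join(); // pending transactions are drained before exit
    }

    // Frontend: runs in the application's thread and returns immediately.
    void push(transaction t)
    {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            queue_.push_back(std::move(t));
        }
        cv_.notify_one();
    }

private:
    // Backend: executes transactions FIFO-style in a background thread.
    void run()
    {
        std::unique_lock<std::mutex> lock(mtx_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (queue_.empty())
                return; // stop requested and nothing left to execute

            transaction t = std::move(queue_.front());
            queue_.pop_front();

            lock.unlock();
            t(); // the system call would happen here, outside the lock
            lock.lock();
        }
    }

    std::mutex mtx_;
    std::condition_variable cv_;
    std::deque<transaction> queue_;
    bool stop_ = false;
    std::thread worker_; // declared last so the other members exist before run() starts
};
```

The application only pays for the enqueue, which is in line with the reduced delay the commit reports on the application side; the sketch makes no claim about reproducing those numbers.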
Aaro Altonen f23e116e04 Start using send_vecio() from socket.cc for frame queue 2019-10-08 08:56:24 +03:00
Aaro Altonen fe14697793 Make the frame queue transaction-based
Using system call dispatcher complicates the frame queue design because
we can no longer store f.ex. NAL and FU headers to caller's stack (as
SCD doesn't have access to that stack).

We must create a transaction object that contains all necessary
information related to one media frame (all Vectored I/O buffers,
RTP headers and outgoing address).

This model works both with and without SCD and is much cleaner than the
previous implementation. It also makes a clearer distinction between
the frontend and backend of the sending operation by creating
a clear producer/consumer model.

One problem that has arisen is memory deallocation and ownership in
general: when SCD is used, it owns the memory given to kvzRTP by
push_frame() BUT it doesn't know what kind of memory it is, so it doesn't
know how to deallocate it. Some kind of deallocation scheme must be
implemented because right now the library leaks a lot of memory.
2019-10-08 08:20:29 +03:00