gigapxy

NAME
DESCRIPTION
SETTING UP
PREPARING TO RUN
RUNNING
AUTHORIZATION
Expiration date for trial versions of gigapxy
AUTHORS
SEE ALSO

NAME

Gigapxy − an inter-protocol data stream relay and proxy.

DESCRIPTION

Gigapxy pipes data channels to corresponding clients; either of the two endpoints may be a network socket or a file.

Basic terminology and use cases
Gigapxy uses the term channel for a data source and client for a destination. This terminology has roots in IPTV (IP television) operations: an IPTV provider gives its service subscribers (clients) access to TV/video channels via an IP-based network.

A common scenario using Gigapxy would be feeding (UDP) data from M multicast channels to N (TCP) clients/subscribers: media players, tools such as wget, curl, etc. A client, in fact, could be any application issuing an appropriate HTTP request.

Gigapxy is designed to serve many clients per channel, efficiently and economically. A built-in caching mechanism allows new clients to start reading cached (channel) data at once, minimizing the delay associated with making a new connection or a multicast group subscription. For the end user that means that changing IPTV channels becomes very fast.

Application modules
Gigapxy is a server application; its services include relaying data to clients and performing administrative tasks, such as reporting application statistics in various formats.

The two core modules of Gigapxy are:
gws (Gigapxy Web Service)

processes, validates and dispatches user requests, handles administrative tasks;

gng (Gigapxy Engine)

serves data to clients.

The two modules run as separate processes; a single instance of gws(1) controls a number (N >= 1) of gng(1) processes.

gws(1) processes and validates a user request for data, sets up input and output ends of associated data streams, then dispatches the request to the appropriate gng(1) instance to handle data transfer. If gws receives an administrative request, it may service it locally or relay to an appropriate gng.

A gng(1), on its end, is fully dedicated to channel−to−client data transfer; it is designed to be unaffected by delays associated with HTTP request processing or even by a crash of the controlling gws(1) instance.

gng reports to gws the events for the clients and channels that it is handling; the reports let gws track the data streams and load−balance requests between multiple gng instances.

gng also (if configured) regularly updates traffic performance statistics (TPS) that gws needs to produce traffic reports.

SETTING UP

Gigapxy is built as a single executable binary (named gigapxy) with two soft links to it set up by the installation process; the links denote the modules: gws and gng.

To see an overview of the command-line parameters accepted by a module, run it with one of the following options: −h, −?, −−help or −−options.

As one might expect, a command-line parameter always overrides the corresponding setting in the configuration.

Please note that running a module without any option is NOT equivalent to requesting a help summary; it will just run the module in its default configuration.

Most of Gigapxy’s parameters should be specified in module−designated configuration files. Unless a full path is specified on the command line, Gigapxy looks for either gws.conf or gng.conf (depending on the module being launched) and then, if the module's own file could not be opened, for gigapxy.conf in each of the following default locations:

(current directory)

/etc

/usr/local/etc

If a configuration file path is specified on the command line, the module will only try to open the file at that particular path.
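The lookup described above can be sketched in Python as follows. This is only an illustration, not Gigapxy's actual code: the function names are hypothetical, and the sketch assumes that within each directory the module's own file is tried before gigapxy.conf (the text leaves the exact interleaving open).

```python
import os

# Default locations named in this section, in the order listed.
SEARCH_DIRS = [".", "/etc", "/usr/local/etc"]


def candidate_config_paths(module, explicit_path=None):
    """Return the config paths to try, in order (hypothetical helper).

    module is "gws" or "gng". An explicit path given on the command line
    disables the search entirely.
    """
    if explicit_path is not None:
        return [explicit_path]
    paths = []
    for directory in SEARCH_DIRS:
        # Assumption: module-specific file first, then the shared file.
        paths.append(f"{directory}/{module}.conf")
        paths.append(f"{directory}/gigapxy.conf")
    return paths


def find_config(module, explicit_path=None):
    """Return the first readable candidate, or None if there is none."""
    for path in candidate_config_paths(module, explicit_path):
        if os.access(path, os.R_OK):
            return path
    return None
```

For example, find_config("gng") would return ./gng.conf if it is readable, and fall through to the remaining candidates otherwise.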

The installation provides ’/etc/gigapxy.conf’ as the default configuration file containing sections for both gws and gng. However, each module’s section could be put into a separate file and passed to the module via the ’-C|--config’ command-line parameter.

The documentation includes a fully annotated configuration file, with every possible option specified and commented on, at:

/usr/share/doc/gigapxy/examples/gigapxy-commented.conf on Linux, or at the corresponding /usr/local location on FreeBSD.

PREPARING TO RUN

Gigapxy can run in a terminal or as a daemon. To run as a daemon, it must be started with root privileges. Root privileges are not required after a short period of initialization; therefore it is suggested that the module run in a non−privileged mode, under a non−root user. The default configuration has application modules (started as root) switch to non-privileged gigapxy account, which is automatically created at installation point with the home directory of /var/run/gigapxy. The user is not removed at de−installation for safety reasons.

Log−file directory /var/log/gigapxy is automatically created at the installation point. No module will start without being able to write into a log file.

NB: It makes sense to have your logs reside in a designated partition that is not shared with your system’s root directory. Setting up for log−rotation and archival are two related tasks that must not be overlooked.

It is suggested that all module instances write their own log, although writing into a shared log is also possible.

System log is automatically updated when a module runs as a daemon.

RUNNING

The most basic topology for Gigapxy is one controlling gws hooked up with a single gng. This would utilize only two CPUs/cores, so you might want to add more gng instances to spread channels across the available cores. Mind that all clients of a single channel go to the one gng designated by their controlling gws. If you want to balance clients of the same channel across multiple gng instances, you would have to introduce more gws instances, each of them handling its portion of the channel’s load.

The only rule to follow when starting up modules is that a gws process should always start before any gng instances it controls. (You could test-run your topology in separate terminal windows to see how it works.) An example control script has been provided at:

/usr/share/gigapxy/scripts/gigapxy.sh on Linux or at the corresponding /usr/local location on FreeBSD
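As a rough illustration of the start-up rule (gws first, then its gng instances), here is a Python sketch. It is not a replacement for the provided control script: the function names and config paths are hypothetical, and the only command-line option used is the -C option described under SETTING UP.

```python
import subprocess


def startup_commands(gws_conf, gng_confs):
    """Return argv lists in launch order: gws first, then every gng."""
    commands = [["gws", "-C", gws_conf]]
    commands += [["gng", "-C", conf] for conf in gng_confs]
    return commands


def launch(gws_conf, gng_confs):
    """Start the modules in order and return the process handles.

    Assumption: the gws and gng links are on PATH and each config file
    holds the section for its module.
    """
    return [subprocess.Popen(argv) for argv in startup_commands(gws_conf, gng_confs)]
```

Calling startup_commands("/etc/gws.conf", ["/etc/gng1.conf", "/etc/gng2.conf"]) yields one gws command followed by two gng commands.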

Until the configuration is finalized and the process fully automated, two command-line options, −T and −v, may be quite helpful. (See command-line options.)

The −v option works cumulatively: it allows for up to −vvvv to specify the deepest (debug) level of verbosity in the log output. If you are testing a particular feature or trying to reproduce a bug, this is the way to run for the log to be most helpful to the support team.

NB: Debug logs grow VERY large very fast, so please make sure you have enough space in your dedicated log partition and that log rotation is set up. gng is especially verbose in its debug output, so take extra caution there: provide both enough space and log storage fast enough to handle a lot of writing without serious performance degradation.

For the −T option, bear in mind that invoking it will disable switching to an alternate (non−privileged) user from root.

Running Gigapxy is trivial once the configuration has been properly set up: launch the modules individually or via a control script.

Gigapxy can be considered running and fully functional when a gws(1) is running with at least one gng(1) controlled by it. A gws(1) may run on its own without a single gng(1) attached but it will not be fully functional: it will NOT be accepting user requests for data until a gng connects. You could still check on it by requesting a status report via admin port.

A gws(1) shutting down gracefully (after one of the quit signals: TERM, QUIT or INT) will also shut down all its engines. This behavior could be overridden in the configuration to safeguard against abnormal situations or bugs causing a graceful exit(3) instead of a crash. Please refer to gws.conf(5) for details.

If a gws(1) crashes, the subservient engines will not shut down at once but after N attempts to re−connect with a gws(1) at the same (socket) path. The associated parameters are also configurable.

Requesting data (user requests)
gws(1) has listeners on two ports for user and admin HTTP requests. The user−request formats are:

URIs for multicast sources support SSM (source−specific multicast) via the {src−addr}@{mcast−addr}:{mcast−port} specifier.
a) http://{addr}:{gws_port}/{cmd}/{mcast−addr}:{mcast−port} OR
http://{addr}:{gws_port}/{cmd}/{src−addr}@{mcast−addr}:{mcast−port}

WHERE
{addr}:{gws_port} ::= IPv4/6 address of the user-request listener;
{cmd} ::= udp;
{mcast−addr}:{mcast−port} ::= IPv4/6 address of the multicast group;
{src−addr} ::= source address (for SSM).

NB: IPv6 addresses are always specified as [{addr}]:port, as in [ff18::1]:5056.

This (udpxy−style) type of request specifies multicast group as the data source and the requesting HTTP connection as the destination.
b) http://{addr}:{gws_port}/src/{channel−uri}/dst/{client−uri}

WHERE
{addr}:{gws_port} ::= IPv4/6 address of the user−request listener;
{channel−uri} ::= URI for the channel (see format below);
{client−uri} ::= URI for the client (see format below);

URI format: {protocol}://{path}?{query}
c) http://{addr}:{gws_port}/${alias}

This type of request uses a channel alias: a dollar−sign prefixed name that resolves to a URL for a channel within a group. Refer to channels.conf(5) for details on configuring channels using aliased groups.
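The three request formats can be assembled with a few small helpers. The Python sketch below is illustrative only; the function names are hypothetical, and the host, port and channel values in the usage note are taken from the examples in this section.

```python
def host_port(addr, port):
    """Format addr:port, wrapping IPv6 literals as [addr]:port."""
    if ":" in addr and not addr.startswith("["):
        addr = "[" + addr + "]"
    return f"{addr}:{port}"


def udp_request(gws, group, source=None):
    """Format (a): udpxy-style request, optionally with an SSM source."""
    prefix = f"{source}@" if source else ""
    return f"http://{gws}/udp/{prefix}{group}"


def src_dst_request(gws, channel_uri, client_uri=""):
    """Format (b): explicit source/destination pair.

    An empty client_uri leaves the destination unspecified, so it
    defaults to the requesting HTTP connection.
    """
    return f"http://{gws}/src/{channel_uri}/dst/{client_uri}"


def alias_request(gws, alias):
    """Format (c): dollar-sign prefixed channel alias."""
    return f"http://{gws}/${alias}"
```

For instance, udp_request(host_port("acme.com", 8080), host_port("ff18::1", 5056)) produces http://acme.com:8080/udp/[ff18::1]:5056.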

Supported protocols are: FILE, TCP, UDP, HTTP. Below are a few examples of requests using different protocols and formats:
a)
http://acme.com:8080/src/file:///opt/data/somefile.dat/dst/?a=bb&c=dd

gws(1) is listening on port 8080 at acme.com

Channel is a file with the full path: /opt/data/somefile.dat

The request has an associated query ’a=bb&c=dd’ which could be used to specify additional parameters for the session.

Client (dst) is not specified, which defaults to the connection of the HTTP request.

The contents of /opt/data/somefile.dat will be sent to the client; at EOF point the engine will wait (in a non-blocking manner) for the file to expand (be appended with more data) and, if the file gets expanded, will send the new data to the client. If the file does not expand within a certain (configurable) time period, the channel will time out and the clients’ sessions will be terminated.

b)
http://acme.com:8080/src/udp://[ff18::1]:5056/dst/file:///opt/data/somefile.dat

Channel is a multicast group with IPv6 address ff18::1, port 5056

Client is a file with the path: /opt/data/somefile.dat

The engine will write any data arriving for the channel (multicast group) into the named file. The channel may time out if no data arrive within a certain time period, in which case the session will be closed. If there’s an error writing to the destination file, the session will also end.

c) http://acme.com:8080/src/udp://[ff18::1]:5056/dst/
d) http://acme.com:8080/udp/[ff18::1]:5056

The two requests above are equivalent (just stated in two different formats).

Both specify the channel as the multicast group [ff18::1]:5056 and the (requesting) HTTP connection as the client. A timeout may occur on either of the network connections here; either of the two connections could also be broken by the peer, thus terminating the session.

e)
http://acme.com:8080/src/http://10.0.1.12:4056/udp/224.0.2.26:4033?kk=yy/dst/tcp://192.168.12.10:5051?mm=ff

specifies that channel data comes as a response to the HTTP GET /udp/224.0.2.26:4033?kk=yy request sent to http://10.0.1.12:4056. Whatever application handles HTTP requests at that address is expected to reply with a data stream destined to a TCP socket connected to the address: 192.168.12.10:5051. This session also has an associated query: ´mm=ff´, which could have a meaning in the context of the given session.

This request underlines Gigapxy’s capability to cascade or ’daisy-chain’ requests, and, therefore, link its instances or itself up with other applications compliant with either of the two request formats (’udp-channel’ and ’src-dst pair’). A chain, such as, for instance, udpxy -> gigapxy -> udpxy -> media player, is made possible by this functionality.

f) http://acme.com:8080/$TV9

requests to use an aliased channel TV9 as the source, the destination defaulting to the requesting connection.

g) http://acme.com:8080/src/$TV9?key=BF094744c5/dst

requests the same aliased channel in gigapxy format and appends the key parameter to the URL the alias resolves to.

h) http://acme.com:8080/udp/10.0.11.26@224.0.2.26:5050 OR
http://acme.com:8080/src/udp://10.0.11.26@224.0.2.26:5050/dst/

request (via SSM) a multicast channel at 224.0.2.26:5050 coming from 10.0.11.26, taking advantage of IGMPv3.

For further details on aliased channels, refer to channels.conf(5).

HTTP URL re−direction
A client could be re−directed to an alternate source if the requested channel happens to be unavailable at the time. gws would reply with HTTP 302 (Moved Temporarily) in the hope that the client software recognizes the code and would follow the re−direction link. gws performs a basic comparison check to ensure that there’s no re−direction loop, yet the responsibility (re−direction loop detection & prevention) lies on the client side.
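Since loop detection is ultimately the client's job, a client might track the URLs it has already visited. The Python sketch below shows one way to do that; it is an assumption about client design, not part of Gigapxy. The fetch callable is hypothetical: it is expected to issue the request and return a (status, location) pair.

```python
def follow_redirects(url, fetch, max_hops=5):
    """Follow HTTP 302 replies, refusing to revisit a URL.

    fetch(url) must return (status, location); location is only used
    when status is 302. Returns the final (url, status) pair.
    """
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise RuntimeError(f"redirect loop detected at {url}")
        seen.add(url)
        status, location = fetch(url)
        if status != 302 or not location:
            return url, status
        url = location
    raise RuntimeError("too many redirects")
```

Passing fetch in as a parameter keeps the loop logic independent of the HTTP library; the max_hops cap is a second safeguard for long chains that never repeat a URL.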

HTTP HEAD support
HTTP HEAD requests can be used to check for channel availability. gws treats HTTP HEAD in the same manner as it would treat a GET, with the exception that it would not send back any channel data; neither would it forward any information to a gng. Re−direction, however, is still performed as appropriate.

Administrative requests
Gigapxy listens on a dedicated TCP port for administrative requests. The request types are as below:

a) reports: http://{addr}:{port}/report?type={type}&format={format}&cached={0|1}

WHERE:

{type} ::= traffic|tps

{format} ::= html|web|xml

Gigapxy supports the following types of reports:

TPS (traffic, tps) - throughput statistics on active channels and clients.

The available report-output formats are:

HTML (html, web) - output as an HTML/web page.

XML (xml) - output as an XML page.

Other popular formats, such as JSON, are also planned for the future.

Note: generation of throughput statistics should be enabled in appropriate config settings for TPS reports to work.

Caching: gws(1) may cache its reports for a certain time period, defined as ws.report.cache_timeout_ms in gws.conf(5). The request URL may request invalidation of the cache by using the cached=0 parameter. NB: this is to be used when getting the most current data is critical. In all other cases, using cached reports would be a wiser choice, saving CPU resources when many report requests arrive in quick succession.
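A report URL can be put together from the parameters listed above. The following Python sketch is a convenience illustration; the function name is hypothetical and the admin address used in the usage note is borrowed from the drop example in this section.

```python
REPORT_TYPES = {"traffic", "tps"}
REPORT_FORMATS = {"html", "web", "xml"}


def report_url(admin, report_type="tps", report_format="xml", cached=True):
    """Build a report request URL for the admin listener at admin."""
    if report_type not in REPORT_TYPES:
        raise ValueError(f"unsupported report type: {report_type}")
    if report_format not in REPORT_FORMATS:
        raise ValueError(f"unsupported report format: {report_format}")
    # cached=0 asks gws to invalidate its cached report.
    flag = 1 if cached else 0
    return f"http://{admin}/report?type={report_type}&format={report_format}&cached={flag}"
```

For example, report_url("127.0.0.1:4047", cached=False) yields http://127.0.0.1:4047/report?type=tps&format=xml&cached=0.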

b) drop/disconnect a channel or a client: http://{addr}:{port}/drop?channel={channel_tag}&client={client_tag}

WHERE:

{channel_tag} is the name tag for the channel;

{client_tag} is the name tag for the client (within the given channel). If client parameter is missing, then channel={channel_tag} with all its clients will be disconnected.

Both channel and client must be specified exactly as TPS reports display them. For instance, for a multicast channel tagged as UDP://224.0.2.15:7010 (please do mind that URI parameters, such as authorization credentials etc., are not included) and a client tagged as TCP://192.168.10.15:50905, with gws listening for admin requests on 127.0.0.1:4047, the request:

http://127.0.0.1:4047/drop?channel=UDP://224.0.2.15:7010&client=TCP://192.168.10.15:50905 will drop (disconnect) only the client, leaving the channel up and running, whereas

http://127.0.0.1:4047/drop?channel=UDP://224.0.2.15:7010 would drop (disconnect) all clients within the channel and cancel/disconnect the channel’s inbound data stream.

gws, upon receiving a ’drop’ request, looks up the channel record (but not the client), locates the appropriate gng and relays the request to it. It is not the responsibility of gws to fulfill the request (since gng handles it from there), so gws would report success (HTTP 200 OK) as soon as the request is sent to gng. If the client in the request is invalid, the error will only be discovered by gng which sends no feedback to the request’s origin. Should the request be successfully fulfilled by gng, it will report client/channel drops to gws, resulting in appropriate entries added to the access log (see gws.conf(5) for more info on gws logs).
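The drop request is simple to compose once the tags are known. The Python sketch below mirrors the examples above and, like them, passes the tags through without percent-encoding; whether gws also accepts encoded tags is not stated here, so treat that as an open assumption. The function name is hypothetical.

```python
def drop_url(admin, channel_tag, client_tag=None):
    """Build a drop request for a whole channel or for a single client.

    Tags must be given exactly as TPS reports display them.
    """
    url = f"http://{admin}/drop?channel={channel_tag}"
    if client_tag:
        # With a client tag, only that client is disconnected.
        url += f"&client={client_tag}"
    return url
```

Keep in mind that an HTTP 200 reply only means gws relayed the request; the access log is the place to confirm that the drop actually happened.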

c) ping/status of the service: http://{addr}:{port}/ping or http://{addr}:{port}/status. The status keyword is supported for compatibility with udpxy, whose status command (which actually produced a status report, and so is NOT the equivalent of a ping) was nevertheless commonly used to check whether the service was up. The preferred keyword for gigapxy is, of course, ping. gws returns HTTP 200 whenever it receives the command.

d) disconnect all clients and channels: http://{addr}:{port}/reset - this will have gws send SIGUSR2 to all attached gng instances. SIGUSR2 directs a gng to drop all its channels and clients.

AUTHORIZATION

Gigapxy utilizes authorization helpers − user−supplied components − communicating with gws(1) via STDIN and STDOUT. With authorization enabled (via config), each user request results in an authorization request sent to a vacant auth helper. An illustrative example of a helper is provided at:

/usr/share/gigapxy/scripts/gauth.sh

An authorization request is a text string terminated by CR/LF, with the following fields separated by whitespace:

[ID] [peer] [source] [destination] [CRLF]

[ID] is A{num}, where {num} is a sequence number generated by gws; Example: A3404;

[peer] is combined IP address and port of the remote host requesting access; Example: 104.12.33.67:12301;

[source] is the URI of the channel being requested, including the authorization token; Example: udp://224.0.2.12:5011?auth=ef031204ba0c.

NB: The format of the authorization token is not dictated in any way by gws: it’s a mere convention between the client requesting access and the (user−defined) authorization logic embedded in the helper. gws passes what it recognizes as source to auth helper as is.

[destination] is URI for the destination or ’−’ for destination being the requesting TCP connection; Example: −;

[CRLF] is a sequence of two symbols with ASCII codes 0x0d and 0x0a.

The example request will be as below:

A3404 104.12.33.67:12301 udp://224.0.2.12:5011?auth=ef031204ba0c -

The helper validates the request and responds in the following format:

[ID] [result] [CRLF]

[ID] is the request ID, i.e. A3404 in our case.

[result] is a numeric value that gws recognizes as an approval code if 0 (zero) and as denial otherwise.

Therefore, an approval for the request above should look as:

A3404 0

NB: A denial code could be arbitrary as long as it is non−zero; gws logic recognizes no difference between 1 and 210045, they both indicate denial of access and result in the 403 Forbidden HTTP response being forwarded to the client; then the client session ends.
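To make the exchange concrete, here is a minimal helper sketch in Python. It is not the shipped gauth.sh: the token set, the auth query parameter name and the function names are assumptions chosen to match the example request above, and a real helper would plug in its own validation logic.

```python
import sys
from urllib.parse import urlsplit, parse_qs

# Hypothetical token store; a real helper might query a database instead.
ALLOWED_TOKENS = {"ef031204ba0c"}


def handle_request(line, allowed=ALLOWED_TOKENS):
    """Turn one request line into a reply line, or None for a blank line.

    Request: [ID] [peer] [source] [destination] CRLF
    Reply:   [ID] [result] CRLF, where 0 approves and non-zero denies.
    """
    fields = line.split()
    if not fields:
        return None
    request_id = fields[0]
    result = 1  # deny unless the request proves valid
    if len(fields) == 4:
        source = fields[2]
        # Assumption: the token travels as the "auth" query parameter.
        token = parse_qs(urlsplit(source).query).get("auth", [""])[0]
        if token in allowed:
            result = 0
    return f"{request_id} {result}\r\n"


def main():
    """Serve requests from STDIN until gws closes the pipe."""
    for line in sys.stdin:
        reply = handle_request(line)
        if reply is not None:
            sys.stdout.buffer.write(reply.encode("ascii"))
            sys.stdout.buffer.flush()  # reply at once; gws is waiting
```

A script used as a helper would end with a call to main(). Flushing after every reply matters because buffered output could otherwise sit unsent until the helper is timed out.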

Since gws has no guarantee that a helper will not block on a request, it times out auth requests using the applicable settings for user requests (please see gws.conf(5) for the particular settings). If a client/user request times out while handling an authorization task, the engaged auth helper is terminated via kill(2).

Do make sure your time−out settings for user requests are well−balanced to allow ample time for auth requests to complete gracefully. Also, ensure that enough auth helpers are running to distribute requests to. gws(1) issues a warning about a slow auth helper when it detects one (at a time−out); a sequence of such warnings would indicate a misconfiguration issue.

Expiration date for trial versions of gigapxy

Please note that all beta versions come with an expiration date that is displayed in round brackets in the application info. Running a gigapxy module (gws or gng) with the -V option will display the application info line. Gigapxy will not run past the expiration date, or if it cannot reliably determine the current time (which it does by contacting an NTP service over the internet). This feature is not applicable to non-trial (commercially licensed) versions of gigapxy.

AUTHORS

Pavel V. Cherenkov

SEE ALSO

gws(1), gng(1), gws.conf(5), gng.conf(5), channels.conf(5), gigapxy.auth(5)