GitXplorerGitXplorer
l

knox-config-docs

public
1 stars
0 forks
0 issues

Commits

List of commits on branch master.
Unverified
0d16de2917ccdc97ac7ede211383782a4edfe33c

More updates

llfrancke committed 6 years ago
Unverified
7ce47ee375d211dfa08f443bc1f9f2c6395e3422

More documentation on the startup process

llfrancke committed 6 years ago
Unverified
867f00e23fdad388637fa340cb6ce2c530434e27

Updates

llfrancke committed 6 years ago
Unverified
1d92b809b05805630b593160b222395de50f81b1

Initial commit

llfrancke committed 6 years ago

README

The README file for this repository.

= Introduction

=== Out Of Scope

  • Platforms other than Linux (e.g. Windows)
  • Tools other than Knox CLI and the Gateway itself

= Maven Assemblies

Knox uses the Maven Assembly plugin to build various JAR files. The important ones for us are:

  • knoxcli.jar ** Main Class: org.apache.knox.gateway.launcher.Launcher ** Actual Main Class: org.apache.knox.gateway.util.KnoxCLI ** Maven Module: knox-cli-launcher
  • gateway.jar ** Main Class: org.apache.knox.gateway.launcher.Launcher ** Actual Main Class: org.apache.knox.gateway.GatewayServer ** Maven Module: gateway-server-launcher

= Shell scripts

There is a bash script for each use-case:

  • knoxcli.sh
  • gateway.sh
  • knox-env.sh ** Not called directly but sourced by the others

The CLI tools don't start the relevant classes (e.g. GatewayServer) directly but instead starts a Launcher class (see above) which runs some bootstrapping code and then starts the relevant main class later.

== gateway.sh

  1. Sources knox-env.sh
  2. It checks whether various directories are readable and/or writable
  3. It runs Java with various parameters which cannot be overridden.

WARNING: A Problem here is that the gateway.sh script uses a different way of resolving directories than the Java code. So it can fail startup even if the Java code would work.

We can not resolve this problem when directories can be configured using gateway-site.xml. Environment variables and Java properties we could control because we're the script that passes them in but gateway-site.xml is not under our control.

== knoxcli.sh

This works very similar to gateway.sh with fewer options so I'd apply the same changes here.

== Proposed changes

  1. https://github.com/anordal/shellharden[shellharden] & https://github.com/koalaman/shellcheck[shellcheck]
  2. Stop redirecting stderr & stdout to a file when the server runs in the foreground
  3. Introduce environment variables with a prefix of KNOX_GATEWAY_ that allow changing the internal ones ** e.g. KNOX_GATEWAY_LOG_OPTS
  4. Remove the hardcoded checks for directory existence (TODO: Remove or fix if possible?)
  5. Move the function from knox-env.sh to knox-functions.sh
  6. Remove the empty ENV_PID_DIR from knox-env.sh
  • Emit a warning first if it's set outside
  1. Document all possible environment variables in knox-env.sh but leave them commented
  2. Entirely remove the checks for readability & writability of directories

TODO: Not sure if it would also make sense to search for knox-env.sh in a different location (e.g. config dir)

= Java

== Launcher

  1. Finds the directory where its own JAR is located
  2. Calculates a "launcher name" which is based on the JAR file that's loaded (e.g. gateway.jar -> gateway) ** These jar files are also located in the bin directory right next to the shell scripts
  3. It loads a Properties file called launcher.cfg file from the classpath (META-INF/launcher.cfg) ** This default property file contains four properties: *** GATEWAY_HOME -> ${launcher.dir}/.. *** log4j.configuration -> "${GATEWAY_HOME}/conf/${launcher.name}-log4j.properties *** main.class -> org.apache.knox.gateway.GatewayServer *** class.path ../conf;../lib/\*.jar;../dep/*.jar;../ext;../ext/*.jar
  4. It reads two variables from this embedded file called "override" and "extract" ** These are not present in the launcher.cfg for the Gateway or the KnoxCLI so they always default to true
  5. If override is true Launcher now tries to find and/or create a file called ${launcher name}.cfg in the launcher jar directory ** If it can't find this file it'll create a copy of its internal one
  6. It'll set the properties launcher.dir (e.g. the bin dir) and launcher.name (e.g. gateway)
  7. Launcher then calls Command with the properties (now six by default) it just created ** It first runs the constructor and then immediately calls run()

== Command

=== Constructor

  1. Command first consumes the passed in properties by loading the following properties into instance variables and removing them from the Properties object ** main.class, main.method (default: main), class.path, fork (default: false), redirect (default: false), restream (default: true) ** TODO: It also does something with args[] ** The passed properties now have four properties left: GATEWAY_HOME, log4j.configuration, launcher.name, launcher.dir
  2. Launcher now separates these into two parts: vars are all properties that start with env. and it stores those in an instance variable. All others end up in props ** By default props now contains those four properties from the last step and vars is empty

=== run() Run makes a decision based on the variable fork. If it's true (default is false) it runs a class called Forker which at the moment throws an UnsupportedOperationException.

TIP: This fork property along with Forker could be removed as no one can use it anyway

The normal path now calls Invoker#invoke(Command).

== Invoker

  1. This first takes all props from the command and it sets all of them as Java System properties ** It first resolves variables (e.g. ${launcher.dir})

WARNING: This is part of the problem! I don't think we should be setting these properties here at least not when they come from the embedded launcher.cfg

  1. It creates a Classloader with the classPath created earlier
  2. It loads the main class
  3. It finds the main method
  4. It invokes the main method

== GatewayServer

  1. It first initializes logging from the log4j.configuration System property using the PropertyConfigurator
  2. It creates a GatewayConfigImpl class

== GatewayConfigImpl

This is a subclass of the Hadoop Configuration class. So, it has properties which are basically key-value pairs.

.init() Method

  1. It first loads all environment variables into properties called env.${ENV_NAME}
  2. It then loads all Java System properties into properties called sys.${PROPERTY_NAME}
  3. It loads a hardcoded list of files:
  • The files: ** conf/gateway-default.xml ** conf/gateway-site.xml
  • It looks in these places: ** Java System Property GATEWAY_HOME ** Environment Variable GATEWAY_HOME ** Java System Property user.dir ** Classpath (this is where it usually finds the gateway-default.xml file

NOTE: Because we unconditionally set GATEWAY_HOME as a System property in the Invoker the Environment variable cannot take effect + TIP: To work around this you can create an empty conf directory so it cannot find the files. This allows you to use the environment variable.

.initGatewayHomeDir(...) method 1.

== Gateway Home Dir

TODO

== Data Dir

  1. Java Property: TODO
  2. Environment Variable
  3. Configuration Property gateway.data.dir
  4. Default: ${gateway home}/data

=== Services Dir

  1. Configuration Property: gateway.services.dir
  2. Default: ${data dir}/services

=== Applications Dir

  1. Configuration Property: gateway.applications.dir
  2. Default: ${data dir}/applications

=== Deployment Dir

  1. Configuration Property: gateway.deployment.dir
  2. Default: ${data dir}/deployments

== Conf Dir

TODO: No Java Propertyy?

  1. Configuration Property: TODO
  2. Default: ${gateway home}/conf

=== Shared Providers Config Dir

  1. Default: ${conf dir}/shared-providers

=== Descriptors Dir

  1. Default: ${conf dir}/descriptors

=== Topologies Dir

  1. Default: ${conf dir}/topologies

== Log Dir

  1. Java Property: log4j.configuration

Knox uses the PropertyConfigurator to initialize Log4J.

If that is not defined it uses a default from its embedded gateway.cfg file which points to ${GATEWAY_HOME}/conf/${launcher.name}-log4j.properties. For this CSD we've modified the gateway.sh script to take an extra environment variable called GATEWAY_LOG_OPTS which we then point at the right location.

== Implementation notes

=== Changing the path where Knox looks for gateway-log4j.properties

Knox uses the PropertyConfigurator to initialize Log4J using the Java system property log4j.configuration. If that is not defined it uses a default from its embedded gateway.cfg file which points to ${GATEWAY_HOME}/conf/${launcher.name}-log4j.properties. For this CSD we've modified the gateway.sh script to take an extra environment variable called GATEWAY_LOG_OPTS which we then point at the right location.

=== Changing the path where Knox looks for the gateway-site.xml file

Knox looks for gateway-site.xml in these locations in this order:

  1. Java System Property GATEWAY_HOME/conf/gateway-site.xml
  2. Environment variable GATEWAY_HOME/conf/gateway-site.xml
  3. Java System Property user.dir/conf/gateway-site.xml
  4. Classpath conf/gateway-site.xml

The path part conf is hardcoded.

Unfortunately the launcher has an embedded cfg file that contains a hardcoded GATEWAY_HOME property which the Invoker class then propagates to a Java System property. The only way to have the environment variable take effect is by removing the default conf/gateway-site.xml file from the Knox distribution. The directory conf needs to stay though because gateway.sh checks for its existence. The path is hardcoded in the script and cannot be changed even though its pointing to the wrong location. Solution is to create an empty conf directory or to patch the gateway.sh file. This parcel does the former.