
tblmonit


Commits

List of commits on branch master.
42249846f661f56036e77620d5a8da22eae478bf

update ci (#19)

hhirosassa committed 2 years ago
7fefd9fcd2eb03d5c058185fb8d7d829cfa9beec

Update go version (#17)

hhirosassa committed 3 years ago
e9a07bd175fd954e0707222c00d1fb23ad0b7498

update go version (#16)

hhirosassa committed 4 years ago
000c36e04ea586ae55de5bf875d6ab5c3d4c733d

Show details (#15)

kkitagry committed 4 years ago
baba7fd5925aa76fd3712a884dcd0fd0ef99003a

To set option (#14)

kkitagry committed 4 years ago
43f5133faa3ee4278a943c54a82f46185e51ce58

Set timeThreshold and durationThreshold optional and fix isOld logic (#13)

kkitagry committed 4 years ago

README


tblmonit

A monitoring tool for BigQuery table metadata

Usage

Set config file

You can set the time zone in $HOME/.tblmonit.yaml, or point to a different file with the --config option.

The default time zone is Local. Any other value is taken to be a location name corresponding to a file in the IANA Time Zone database, such as America/New_York.

Example:

timeZone: Asia/Tokyo

Check freshness of tables

First, prepare a configuration file in TOML format that lists the target tables to monitor, like the one below:

[[Project]]
    ID = "bigquery-project-id-1"
    [[Project.Dataset]]
        ID = "dataset1"
        [[Project.Dataset.TableConfig]]
            Table = "table1"
            DateForShards = ""
            TimeThreshold = "09:00:00"
            DurationThreshold = "24h"
    [[Project.Dataset]]
        ID = "dataset2"
        [[Project.Dataset.TableConfig]]
            Table = "sharded_table2_on_"
            DateForShards = "ONE_DAY_AGO"
            TimeThreshold = "12:00:00"
            DurationThreshold = "1h"
[[Project]]
    ID = "bigquery-project-id-2"
    [[Project.Dataset]]
        ID = "dataset3"
        [[Project.Dataset.TableConfig]]
            Table = "table1"
            DateForShards = ""
            TimeThreshold = "09:00:00"
            DurationThreshold = "24h"

Then run the command as follows:

tblmonit freshness [target config file]

If the current time is past TimeThreshold and the target table's last-modified time is older than DurationThreshold (or the table is not found), tblmonit outputs a list of such tables in the following format:

bigquery-project-id-1.dataset2.sharded_table2_on_20200101
bigquery-project-id-2.dataset3.table1

DateForShards is for tables sharded by date (the table name suffix must be in YYYYMMDD format).

For sharded tables, DateForShards should be one of ONE_DAY_AGO, TODAY, or FIRST_DAY_OF_THE_MONTH. For non-sharded tables, leave it empty, as in the example above.

Flexible configuration (experimental)

This feature is experimental.

Editing the configuration file by hand is error-prone. To specify the list of tables and thresholds more easily, you can use the FlexConfig DSL and the expand command.

First, prepare a TOML file like the one below:

[[FlexProject]]
    ID = "bigquery-project-id-1"
    [[FlexProject.FlexDataset]]
        ID = "dataset1" # a regular expression can be used here
        [[FlexProject.FlexDataset.FlexTableConfig]]
            FlexTable = "*"  # a regular expression can be used to select tables
            DateForShards = "ONE_DAY_AGO"
            # at least one of TimeThreshold or DurationThreshold must be specified
            TimeThreshold = "09:00:00"
            DurationThreshold = "24h"
[[FlexProject]]
    ID = "bigquery-project-id-2"
    [[FlexProject.Dataset]] # use Dataset (not FlexDataset) for an exact match
        ID = "dataset1"
        [[FlexProject.Dataset.TableConfig]]
            FlexTable = "*"
            DateForShards = "ONE_DAY_AGO"
            # at least one of TimeThreshold or DurationThreshold must be specified
            TimeThreshold = "09:00:00"
            DurationThreshold = "24h"

Then run the following command:

tblmonit config expand [target config file]
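
Conceptually, expanding a FlexConfig means matching a pattern against the tables that exist in a dataset. The Go sketch below shows one way such matching could work. It is an assumption-laden illustration: matchTables and the table names are hypothetical, the pattern is anchored to the whole name by choice, and the sketch uses ".*" because Go's regexp package rejects a bare "*". How tblmonit itself interprets "*" is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchTables returns the names that match pattern in full.
// Assumption: the pattern is anchored, so "table1" does not also match "table10".
func matchTables(pattern string, names []string) ([]string, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	var matched []string
	for _, n := range names {
		if re.MatchString(n) {
			matched = append(matched, n)
		}
	}
	return matched, nil
}

func main() {
	// Hypothetical table names standing in for a dataset listing.
	names := []string{"table1", "table10", "sharded_table2_on_20200101"}

	exact, _ := matchTables("table1", names)
	fmt.Println(exact) // [table1]

	all, _ := matchTables(".*", names)
	fmt.Println(len(all)) // 3

	_, err := matchTables("*", names)
	fmt.Println(err != nil) // true
}
```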

The command outputs the following config file, which the tblmonit freshness command accepts:

[[Project]]
  ID = "bigquery-project-id-1"

  [[Project.Dataset]]
    ID = "dataset1"

    [[Project.Dataset.TableConfig]]
      Table = "table1"
      DateForShards = ""
      TimeThreshold = "2020-12-13T09:00:00+09:00"
      DurationThreshold = "24h0m0s"
    [[Project.Dataset.TableConfig]]
      Table = "sharded_table2_on_"
      DateForShards = "ONE_DAY_AGO"
      TimeThreshold = "2020-12-13T09:00:00+09:00"
      DurationThreshold = "24h0m0s"