Background

Recently, while designing the logging scheme for our company's container cloud, we found that the mainstream ELK/EFK stack is relatively heavy, and many of Elasticsearch's complex search features are not needed at this stage, so we finally chose Grafana's open-source Loki log system. The following introduces the background of Loki.
Background and motivation

When there is a problem with an application or a node running in our container cloud, the troubleshooting process should go like this:

Our monitoring is built on the Prometheus stack. The two central concepts in Prometheus are metrics and alerts: a metric records that some value was reached now or in the past, and an alert fires when a metric crosses a configured threshold. But this information alone is clearly not enough. The basic unit of Kubernetes is the Pod, and a Pod writes its logs to stdout and stderr; whenever something goes wrong, we usually check the relevant logs in a UI or with a command. For example, when one of our pods' memory grows and triggers an alert, the administrator first goes to a page to confirm which pod has the problem, and then, to find out why its memory grew, has to query that pod's logs. Without a log system, we have to do this through the UI or with commands such as `kubectl logs`.

If the application crashes at that moment, those logs are gone, so we need to introduce a log system that collects logs centrally. And if we use ELK, we have to switch back and forth between Kibana and Grafana, which hurts the user experience. Therefore, Loki's first goal is to minimize the cost of switching between metrics and logs, which shortens the response time to incidents and improves the user experience.

Problems with ELK

Many existing log collection schemes build a full-text index over the logs (the ELK stack being the typical example). The advantage is rich functionality and support for complex queries, but these systems tend to be large, resource-hungry, and hard to operate. Much of that functionality is never used: most queries only cover a time range and a few simple dimensions (host, service, etc.), so using such a solution feels like using a sledgehammer to crack a nut.

Therefore, Loki’s second purpose is to achieve a trade-off between the ease of operation and complexity of the query language.
The cost of full-text search

Full-text search also brings a cost problem: put simply, sharding and replicating the inverted index of a full-text engine (such as Elasticsearch) is expensive. Later designs such as OKLog took a different route, using an eventually consistent, grid-based distribution strategy. Those two design decisions cut cost dramatically and are very simple to operate, but querying is not convenient. Therefore, Loki's third goal is to offer a more cost-effective solution.
Architecture

The architecture of Loki is as follows:

It is not difficult to see that Loki's architecture is very simple: it uses the same labels as Prometheus as its index, so you can query both log content and monitoring data through the same set of labels. This not only lowers the cost of switching between the two kinds of queries, but also greatly shrinks the log index. Loki reuses Prometheus's service-discovery and label-relabeling libraries in its agent, promtail. promtail runs as a DaemonSet on every Kubernetes node, obtains the correct metadata for the logs through the Kubernetes API, and ships them to Loki. The following is the storage architecture for logs:

Writing log data relies mainly on two components, the Distributor and the Ingester; the overall flow is as follows:

Distributor

Once promtail has collected logs and sent them to Loki, the Distributor is the first component to receive them. Logs can arrive in very large volumes, so they cannot be written to the database one by one as they come in; that would overwhelm it. The data needs to be batched and compressed.

Loki does this by building compressed chunks as logs stream in: the ingester, a stateful component, assembles chunks and flushes them to storage once a chunk reaches a certain size or age. Each log stream belongs to one ingester; when a log line arrives at the Distributor, it uses the stream's metadata and a hash to work out which ingester it should go to.
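That routing step can be sketched as a consistent-hash ring lookup. The following is a simplified illustration only, not Loki's actual implementation (Loki uses Cortex's hash ring); the class and names are invented for the example:

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Minimal consistent-hash ring: a key is owned by the next token clockwise."""
    def __init__(self, ingesters, tokens_per_node=4):
        self.ring = []  # sorted list of (token, ingester)
        for name in ingesters:
            for i in range(tokens_per_node):
                token = int(hashlib.sha256(f"{name}-{i}".encode()).hexdigest(), 16)
                self.ring.append((token, name))
        self.ring.sort()

    def lookup(self, stream_labels):
        # Hash the stream's (sorted) label set, then walk clockwise to the owner.
        key = int(hashlib.sha256(repr(sorted(stream_labels.items())).encode()).hexdigest(), 16)
        idx = bisect_right(self.ring, (key,)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["ingester-0", "ingester-1", "ingester-2"])
owner = ring.lookup({"namespace": "cicd", "app": "mysql"})
# The same label set always maps to the same ingester.
assert owner == ring.lookup({"app": "mysql", "namespace": "cicd"})
```

The key property is that all lines of one stream land on the same ingester, so chunks for a stream are built in one place.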

In addition, for redundancy and resilience, each log entry is replicated to n ingesters (3 by default).
Ingester

The ingester receives the log lines and starts building chunks:

Basically, the log lines are compressed and appended to a chunk. Once the chunk "fills up" (it has accumulated a certain amount of data or a certain period of time has passed), the ingester flushes it to the database. Chunks and the index go to separate databases, because they store different types of data.
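The fill-then-flush lifecycle can be illustrated with a toy chunk. This is a sketch under assumed thresholds (the byte and age limits here are placeholders, not Loki's real defaults):

```python
import gzip
import time

class Chunk:
    """Toy chunk: accumulate lines, flush when big enough or old enough."""
    def __init__(self, max_bytes=4096, max_age_seconds=300):
        self.entries = []
        self.size = 0
        self.created = time.time()
        self.max_bytes = max_bytes
        self.max_age = max_age_seconds

    def append(self, line: str) -> None:
        self.entries.append(line)
        self.size += len(line)

    def is_full(self) -> bool:
        # Flush when enough data has accumulated or the chunk is old enough.
        return self.size >= self.max_bytes or time.time() - self.created >= self.max_age

    def compress(self) -> bytes:
        # The chunk is gzipped before being written to the chunk store.
        return gzip.compress("\n".join(self.entries).encode())

chunk = Chunk(max_bytes=32)
chunk.append('level=error msg="connection refused"')
chunk.append('level=info msg="retrying"')
if chunk.is_full():
    blob = chunk.compress()  # would be flushed to object storage
```

After a flush, a fresh empty chunk takes its place, which is exactly the behavior described in the next paragraph.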

After flushing a chunk, the ingester creates a new empty chunk and appends new entries to it.
Querier

Reading is as simple as giving a time range and a label selector: the Querier consults the index to determine which chunks match, then greps through them to produce the results. It also fetches from the ingesters the latest data that has not yet been flushed.

For each query, one querier surfaces all the relevant logs; queries are parallelized, providing a distributed grep that copes even with large queries.
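Conceptually, a read is "select the streams whose labels match, then grep the lines in the time range". A minimal in-memory sketch of that idea (illustrative only; the sample streams are invented):

```python
import re

# Toy "store": each stream is (labels, [(timestamp, line), ...]).
streams = [
    ({"namespace": "cicd", "app": "jenkins"},
     [(100, "build started"), (160, "ERROR: build failed")]),
    ({"namespace": "default", "app": "mysql"},
     [(120, "ready for connections")]),
]

def query(selector, pattern, start, end):
    out = []
    for labels, entries in streams:
        # Label selector: every requested label must match exactly.
        if all(labels.get(k) == v for k, v in selector.items()):
            for ts, line in entries:
                if start <= ts <= end and re.search(pattern, line):
                    out.append((ts, line))
    return sorted(out)

hits = query({"namespace": "cicd"}, "ERROR", start=0, end=200)
# → [(160, 'ERROR: build failed')]
```

Loki parallelizes the grep phase across queriers; the label index keeps the set of chunks to scan small.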

Scalability

Loki's index can be stored in Cassandra/Bigtable/DynamoDB, the chunks in a variety of object stores, and the Querier and Distributor are stateless components. The ingester is stateful, but when nodes are added or removed, the chunks are redistributed across the nodes to match the new hash ring. Cortex, the implementation underlying Loki's storage, has been running in production for years, which gives us some confidence to experiment with Loki in our environment.

Loki is very easy to install.
Create namespace

oc new-project loki

Permission setting

oc adm policy add-scc-to-user anyuid -z default -n loki
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:loki:default

Install Loki

Install command:

oc create -f statefulset.json -n loki

statefulset.json is as follows:

{
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {
        "name": "loki"
    },
    "spec": {
        "podManagementPolicy": "OrderedReady",
        "replicas": 1,
        "revisionHistoryLimit": 10,
        "selector": {
            "matchLabels": {
                "app": "loki"
            }
        },
        "serviceName": "womping-stoat-loki-headless",
        "template": {
            "metadata": {
                "annotations": {
                    "checksum/config": "da297d66ee53e0ce68b58e12be7ec5df4a91538c0b476cfe0ed79666343df72b",
                    "prometheus.io/port": "http-metrics",
                    "prometheus.io/scrape": "true"
                },
                "creationTimestamp": null,
                "labels": {
                    "app": "loki",
                    "name": "loki"
                }
            },
            "spec": {
                "affinity": {},
                "containers": [
                    {
                        "args": [
                            "-config.file=/etc/loki/local-config.yaml"
                        ],
                        "image": "grafana/loki:latest",
                        "imagePullPolicy": "IfNotPresent",
                        "livenessProbe": {
                            "failureThreshold": 3,
                            "httpGet": {
                                "path": "/ready",
                                "port": "http-metrics",
                                "scheme": "HTTP"
                            },
                            "initialDelaySeconds": 45,
                            "periodSeconds": 10,
                            "successThreshold": 1,
                            "timeoutSeconds": 1
                        },
                        "name": "loki",
                        "ports": [
                            {
                                "containerPort": 3100,
                                "name": "http-metrics",
                                "protocol": "TCP"
                            }
                        ],
                        "readinessProbe": {
                            "failureThreshold": 3,
                            "httpGet": {
                                "path": "/ready",
                                "port": "http-metrics",
                                "scheme": "HTTP"
                            },
                            "initialDelaySeconds": 45,
                            "periodSeconds": 10,
                            "successThreshold": 1,
                            "timeoutSeconds": 1
                        },
                        "resources": {},
                        "terminationMessagePath": "/dev/termination-log",
                        "terminationMessagePolicy": "File",
                        "volumeMounts": [
                            {
                                "mountPath": "/tmp/loki",
                                "name": "storage"
                            }
                        ]
                    }
                ],
                "dnsPolicy": "ClusterFirst",
                "restartPolicy": "Always",
                "schedulerName": "default-scheduler",
                "terminationGracePeriodSeconds": 30,
                "volumes": [
                    {
                        "emptyDir": {},
                        "name": "storage"
                    }
                ]
            }
        },
        "updateStrategy": {
            "type": "RollingUpdate"
        }
    }
}

Install Promtail

Installation command:

oc create -f configmap.json -n loki

configmap.json is as follows:

{
    "apiVersion": "v1",
    "data": {
        "promtail.yaml": "client:\n  backoff_config:\n    maxbackoff: 5s\n    maxretries: 5\n    minbackoff: 100ms\n  batchsize: 102400\n  batchwait: 1s\n  external_labels: {}\n  timeout: 10s\npositions:\n  filename: /run/promtail/positions.yaml\nserver:\n  http_listen_port: 3101\ntarget_config:\n  sync_period: 10s\n\nscrape_configs:\n- job_name: kubernetes-pods-name\n  pipeline_stages:\n    - docker: {}\n  kubernetes_sd_configs:\n  - role: pod\n  relabel_configs:\n  - source_labels:\n    - __meta_kubernetes_pod_label_name\n    target_label: __service__\n  - source_labels:\n    - __meta_kubernetes_pod_node_name\n    target_label: __host__\n  - action: drop\n    regex: ^$\n    source_labels:\n    - __service__\n  - action: labelmap\n    regex: __meta_kubernetes_pod_label_(.+)\n  - action: replace\n    replacement: $1\n    separator: /\n    source_labels:\n    - __meta_kubernetes_namespace\n    - __service__\n    target_label: job\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_namespace\n    target_label: namespace\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_name\n    target_label: instance\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_container_name\n    target_label: container_name\n  - replacement: /var/log/pods/*$1/*.log\n    separator: /\n    source_labels:\n    - __meta_kubernetes_pod_uid\n    - __meta_kubernetes_pod_container_name\n    target_label: __path__\n- job_name: kubernetes-pods-app\n  pipeline_stages:\n    - docker: {}\n  kubernetes_sd_configs:\n  - role: pod\n  relabel_configs:\n  - action: drop\n    regex: .+\n    source_labels:\n    - __meta_kubernetes_pod_label_name\n  - source_labels:\n    - __meta_kubernetes_pod_label_app\n    target_label: __service__\n  - source_labels:\n    - __meta_kubernetes_pod_node_name\n    target_label: __host__\n  - action: drop\n    regex: ^$\n    source_labels:\n    - __service__\n  - action: labelmap\n    regex: __meta_kubernetes_pod_label_(.+)\n  - action: replace\n    replacement: $1\n    separator: /\n    source_labels:\n    - __meta_kubernetes_namespace\n    - __service__\n    target_label: job\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_namespace\n    target_label: namespace\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_name\n    target_label: instance\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_container_name\n    target_label: container_name\n  - replacement: /var/log/pods/*$1/*.log\n    separator: /\n    source_labels:\n    - __meta_kubernetes_pod_uid\n    - __meta_kubernetes_pod_container_name\n    target_label: __path__\n- job_name: kubernetes-pods-direct-controllers\n  pipeline_stages:\n    - docker: {}\n  kubernetes_sd_configs:\n  - role: pod\n  relabel_configs:\n  - action: drop\n    regex: .+\n    separator: ''\n    source_labels:\n    - __meta_kubernetes_pod_label_name\n    - __meta_kubernetes_pod_label_app\n  - action: drop\n    regex: ^([0-9a-z-.]+)(-[0-9a-f]{8,10})$\n    source_labels:\n    - __meta_kubernetes_pod_controller_name\n  - source_labels:\n    - __meta_kubernetes_pod_controller_name\n    target_label: __service__\n  - source_labels:\n    - __meta_kubernetes_pod_node_name\n    target_label: __host__\n  - action: drop\n    regex: ^$\n    source_labels:\n    - __service__\n  - action: labelmap\n    regex: __meta_kubernetes_pod_label_(.+)\n  - action: replace\n    replacement: $1\n    separator: /\n    source_labels:\n    - __meta_kubernetes_namespace\n    - __service__\n    target_label: job\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_namespace\n    target_label: namespace\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_name\n    target_label: instance\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_container_name\n    target_label: container_name\n  - replacement: /var/log/pods/*$1/*.log\n    separator: /\n    source_labels:\n    - __meta_kubernetes_pod_uid\n    - __meta_kubernetes_pod_container_name\n    target_label: __path__\n- job_name: kubernetes-pods-indirect-controller\n  pipeline_stages:\n    - docker: {}\n  kubernetes_sd_configs:\n  - role: pod\n  relabel_configs:\n  - action: drop\n    regex: .+\n    separator: ''\n    source_labels:\n    - __meta_kubernetes_pod_label_name\n    - __meta_kubernetes_pod_label_app\n  - action: keep\n    regex: ^([0-9a-z-.]+)(-[0-9a-f]{8,10})$\n    source_labels:\n    - __meta_kubernetes_pod_controller_name\n  - action: replace\n    regex: ^([0-9a-z-.]+)(-[0-9a-f]{8,10})$\n    source_labels:\n    - __meta_kubernetes_pod_controller_name\n    target_label: __service__\n  - source_labels:\n    - __meta_kubernetes_pod_node_name\n    target_label: __host__\n  - action: drop\n    regex: ^$\n    source_labels:\n    - __service__\n  - action: labelmap\n    regex: __meta_kubernetes_pod_label_(.+)\n  - action: replace\n    replacement: $1\n    separator: /\n    source_labels:\n    - __meta_kubernetes_namespace\n    - __service__\n    target_label: job\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_namespace\n    target_label: namespace\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_name\n    target_label: instance\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_container_name\n    target_label: container_name\n  - replacement: /var/log/pods/*$1/*.log\n    separator: /\n    source_labels:\n    - __meta_kubernetes_pod_uid\n    - __meta_kubernetes_pod_container_name\n    target_label: __path__\n- job_name: kubernetes-pods-static\n  pipeline_stages:\n    - docker: {}\n  kubernetes_sd_configs:\n  - role: pod\n  relabel_configs:\n  - action: drop\n    regex: ^$\n    source_labels:\n    - __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_label_component\n    target_label: __service__\n  - source_labels:\n    - __meta_kubernetes_pod_node_name\n    target_label: __host__\n  - action: drop\n    regex: ^$\n    source_labels:\n    - __service__\n  - action: labelmap\n    regex: __meta_kubernetes_pod_label_(.+)\n  - action: replace\n    replacement: $1\n    separator: /\n    source_labels:\n    - __meta_kubernetes_namespace\n    - __service__\n    target_label: job\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_namespace\n    target_label: namespace\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_name\n    target_label: instance\n  - action: replace\n    source_labels:\n    - __meta_kubernetes_pod_container_name\n    target_label: container_name\n  - replacement: /var/log/pods/*$1/*.log\n    separator: /\n    source_labels:\n    - __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror\n    - __meta_kubernetes_pod_container_name\n    target_label: __path__\n"
    },
    "kind": "ConfigMap",
    "metadata": {
        "creationTimestamp": "2019-09-05T01:05:03Z",
        "labels": {
            "app": "promtail",
            "chart": "promtail-0.12.0",
            "heritage": "Tiller",
            "release": "lame-zorse"
        },
        "name": "lame-zorse-promtail",
        "namespace": "loki",
        "resourceVersion": "17921611",
        "selfLink": "/api/v1/namespaces/loki/configmaps/lame-zorse-promtail",
        "uid": "30fcb896-cf79-11e9-b58e-e4a8b6cc47d2"
    }
}

oc create -f daemonset.json -n loki

daemonset.json is as follows:

{
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {
        "annotations": {
            "deployment.kubernetes.io/revision": "2"
        },
        "creationTimestamp": "2019-09-05T01:16:37Z",
        "generation": 2,
        "labels": {
            "app": "promtail",
            "chart": "promtail-0.12.0",
            "heritage": "Tiller",
            "release": "lame-zorse"
        },
        "name": "lame-zorse-promtail",
        "namespace": "loki"
    },
    "spec": {
        "revisionHistoryLimit": 10,
        "selector": {
            "matchLabels": {
                "app": "promtail",
                "release": "lame-zorse"
            }
        },
        "updateStrategy": {
            "rollingUpdate": {
                "maxUnavailable": 1
            },
            "type": "RollingUpdate"
        },
        "template": {
            "metadata": {
                "annotations": {
                    "checksum/config": "75a25ee4f2869f54d394bf879549a9c89c343981a648f8d878f69bad65dba809",
                    "prometheus.io/port": "http-metrics",
                    "prometheus.io/scrape": "true"
                },
                "creationTimestamp": null,
                "labels": {
                    "app": "promtail",
                    "release": "lame-zorse"
                }
            },
            "spec": {
                "affinity": {},
                "containers": [
                    {
                        "args": [
                            "-config.file=/etc/promtail/promtail.yaml",
                            "-client.url=http://loki.loki.svc:3100/api/prom/push"
                        ],
                        "env": [
                            {
                                "name": "HOSTNAME",
                                "valueFrom": {
                                    "fieldRef": {
                                        "apiVersion": "v1",
                                        "fieldPath": "spec.nodeName"
                                    }
                                }
                            }
                        ],
                        "image": "grafana/promtail:v0.3.0",
                        "imagePullPolicy": "IfNotPresent",
                        "name": "promtail",
                        "ports": [
                            {
                                "containerPort": 3101,
                                "name": "http-metrics",
                                "protocol": "TCP"
                            }
                        ],
                        "readinessProbe": {
                            "failureThreshold": 5,
                            "httpGet": {
                                "path": "/ready",
                                "port": "http-metrics",
                                "scheme": "HTTP"
                            },
                            "initialDelaySeconds": 10,
                            "periodSeconds": 10,
                            "successThreshold": 1,
                            "timeoutSeconds": 1
                        },
                        "resources": {},
                        "securityContext": {
                            "readOnlyRootFilesystem": true,
                            "runAsUser": 0
                        },
                        "terminationMessagePath": "/dev/termination-log",
                        "terminationMessagePolicy": "File",
                        "volumeMounts": [
                            {
                                "mountPath": "/etc/promtail",
                                "name": "config"
                            },
                            {
                                "mountPath": "/run/promtail",
                                "name": "run"
                            },
                            {
                                "mountPath": "/var/lib/docker/containers",
                                "name": "docker",
                                "readOnly": true
                            },
                            {
                                "mountPath": "/var/log/pods",
                                "name": "pods",
                                "readOnly": true
                            }
                        ]
                    }
                ],
                "dnsPolicy": "ClusterFirst",
                "restartPolicy": "Always",
                "schedulerName": "default-scheduler",
                "securityContext": {},
                "terminationGracePeriodSeconds": 30,
                "volumes": [
                    {
                        "configMap": {
                            "defaultMode": 420,
                            "name": "lame-zorse-promtail"
                        },
                        "name": "config"
                    },
                    {
                        "hostPath": {
                            "path": "/run/promtail",
                            "type": ""
                        },
                        "name": "run"
                    },
                    {
                        "hostPath": {
                            "path": "/var/lib/docker/containers",
                            "type": ""
                        },
                        "name": "docker"
                    },
                    {
                        "hostPath": {
                            "path": "/var/log/pods",
                            "type": ""
                        },
                        "name": "pods"
                    }
                ]
            }
        }
    }
}

Install the service

oc create -f service.json -n loki

service.json is as follows:

{
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "creationTimestamp": "2019-09-04T09:37:49Z",
        "name": "loki",
        "namespace": "loki",
        "resourceVersion": "17800188",
        "selfLink": "/api/v1/namespaces/loki/services/loki",
        "uid": "a87fe237-cef7-11e9-b58e-e4a8b6cc47d2"
    },
    "spec": {
        "externalTrafficPolicy": "Cluster",
        "ports": [
            {
                "name": "lokiport",
                "port": 3100,
                "protocol": "TCP",
                "targetPort": 3100
            }
        ],
        "selector": {
            "app": "loki"
        },
        "sessionAffinity": "None",
        "type": "NodePort"
    },
    "status": {
        "loadBalancer": {}
    }
}

Syntax

Loki provides an HTTP API; we won't cover it in full here, see: https://github.com/grafana/loki/blob/master/docs/api.md
Below we walk through the query interface.
The first step is to list the label names Loki currently knows about:

curl http://192.168.25.30:30972/api/prom/label

{
 "values": ["alertmanager", "app", "component", "container_name", "controller_revision_hash", "deployment", "deploymentconfig", "docker_registry", "draft", "filename", "instance", "job", "logging_infra", "metrics_infra", "name", "namespace", "openshift_io_component", "pod_template_generation", "pod_template_hash", "project", "projectname", "prometheus", "provider", "release", "router", "servicename", "statefulset_kubernetes_io_pod_name", "stream", "tekton_dev_pipeline", "tekton_dev_pipelineRun", "tekton_dev_pipelineTask", "tekton_dev_task", "tekton_dev_taskRun", "type", "webconsole"]
}

The second step is to get the values of a given label:

curl http://192.168.25.30:30972/api/prom/label/namespace/values

{"values":["cicd","default","gitlab","grafanaserver","jenkins","jx-staging","kube-system","loki","mysql-exporter","new2","openshift-console","openshift-infra","openshift-logging","openshift-monitoring","openshift-node","openshift-sdn","openshift-web-console","tekton-pipelines","test111"]}

The third step is to query by label, for example:

http://192.168.25.30:30972/api/prom/query?direction=BACKWARD&limit=1000&regexp=&query={namespace="cicd"}&start=1567644457221000000&end=1567730857221000000&refId=A

Parameters:

  • query: a query expression (the syntax is detailed in the next section), e.g. {name=~"mysql.+"} or {namespace="cicd"} |= "error", which means: logs from the cicd namespace that contain the word "error".

  • limit: the number of log lines to return.

  • start: start time, as a Unix timestamp in nanoseconds; defaults to one hour ago.

  • end: end time; defaults to the current time.

  • direction: forward or backward; useful when specifying limit; defaults to backward.

  • regexp: a regex filter applied to the results.
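A query URL like the one above can also be assembled programmatically; here with Python's standard library only (the host, port, and timestamps are the example values from this section):

```python
from urllib.parse import urlencode

params = {
    "query": '{namespace="cicd"}',
    "limit": 1000,
    "start": 1567644457221000000,   # Unix nanoseconds
    "end": 1567730857221000000,
    "direction": "BACKWARD",
    "regexp": "",
}
# urlencode percent-escapes the braces and quotes in the query expression.
url = "http://192.168.25.30:30972/api/prom/query?" + urlencode(params)
print(url)
```

The result can then be fetched with any HTTP client, e.g. curl or urllib.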

LogQL syntax

Selector

The label part of a query expression goes in curly braces, with multiple label expressions separated by commas:

{app="mysql", name="mysql-backup"}

The supported operators are:

  • =: exactly equal.

  • !=: not equal.

  • =~: matches the regular expression.

  • !~: does not match the regular expression.

Filter expressions

After the log stream selector, you can filter the results further with a search expression, which can be an exact substring or a regular expression.
For example:
  • {job="mysql"} |= "error"

  • {name="kafka"} |~ "tsdb-ops.*io:2003"

  • {instance=~"kafka-[23]",name="kafka"} != kafka.server:type=ReplicaManager

Multiple filters can be chained:

  • {job="mysql"} |= "error" != "timeout"

Currently supported filter operators:

  • |=: line contains the string.

  • !=: line does not contain the string.

  • |~: line matches the regular expression.

  • !~: line does not match the regular expression.

Regular expressions follow RE2 syntax: https://github.com/google/re2/wiki/Syntax
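The four filter operators behave like chained greps over the selected stream. A small sketch of how they compose (illustrative only, not Loki's implementation; the sample log lines are invented):

```python
import re

def apply_filters(lines, filters):
    """filters: list of (op, arg) pairs; ops mirror LogQL's |=, !=, |~, !~."""
    ops = {
        "|=": lambda line, s: s in line,
        "!=": lambda line, s: s not in line,
        "|~": lambda line, rx: re.search(rx, line) is not None,
        "!~": lambda line, rx: re.search(rx, line) is None,
    }
    # Each filter stage narrows the result of the previous one.
    for op, arg in filters:
        lines = [l for l in lines if ops[op](l, arg)]
    return lines

logs = [
    "error: timeout connecting to db",
    "error: bad handshake",
    "info: started",
]
# Equivalent of: {job="mysql"} |= "error" != "timeout"
print(apply_filters(logs, [("|=", "error"), ("!=", "timeout")]))
# → ['error: bad handshake']
```

This is why chaining filters, as in the example above, simply intersects the conditions in order.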
