Our monitoring system is an adaptation of the Prometheus stack. The two most important concepts in Prometheus are metrics and alerts: a metric records that some quantity has, now or in the past, reached a certain value, and an alert fires when a metric crosses a configured threshold. On their own, though, these signals are not enough. The basic unit of Kubernetes is the Pod, and Pods write their logs to stdout and stderr; when something goes wrong, we usually look at the relevant logs in the web console or with a command. For example, suppose the memory usage of one of our Pods grows and triggers an alert. The administrator first goes to the console to confirm which Pod has the problem, and then, to find out why its memory grew, has to look at that Pod's logs. Without a logging system, this means going to the console or running a command to query them:
Problems
of full-text search
It is not
Of the read and write paths for log data, the write path relies mainly on two components, the Distributor and the Ingester.
After refreshing a chunk
oc new-project loki
oc adm policy add-scc-to-user anyuid -z default -n loki
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:loki:default
oc create -f statefulset.json -n loki
statefulset.json:
{
  "apiVersion": "apps/v1",
  "kind": "StatefulSet",
  "metadata": {
    "name": "loki"
  },
  "spec": {
    "podManagementPolicy": "OrderedReady",
    "replicas": 1,
    "revisionHistoryLimit": 10,
    "selector": {
      "matchLabels": { "app": "loki" }
    },
    "serviceName": "womping-stoat-loki-headless",
    "template": {
      "metadata": {
        "annotations": {
          "checksum/config": "da297d66ee53e0ce68b58e12be7ec5df4a91538c0b476cfe0ed79666343df72b",
          "prometheus.io/port": "http-metrics",
          "prometheus.io/scrape": "true"
        },
        "creationTimestamp": null,
        "labels": {
          "app": "loki",
          "name": "loki"
        }
      },
      "spec": {
        "affinity": {},
        "containers": [
          {
            "args": [ "-config.file=/etc/loki/local-config.yaml" ],
            "image": "grafana/loki:latest",
            "imagePullPolicy": "IfNotPresent",
            "livenessProbe": {
              "failureThreshold": 3,
              "httpGet": {
                "path": "/ready",
                "port": "http-metrics",
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 45,
              "periodSeconds": 10,
              "successThreshold": 1,
              "timeoutSeconds": 1
            },
            "name": "loki",
            "ports": [
              { "containerPort": 3100, "name": "http-metrics", "protocol": "TCP" }
            ],
            "readinessProbe": {
              "failureThreshold": 3,
              "httpGet": {
                "path": "/ready",
                "port": "http-metrics",
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 45,
              "periodSeconds": 10,
              "successThreshold": 1,
              "timeoutSeconds": 1
            },
            "resources": {},
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "volumeMounts": [
              { "mountPath": "/tmp/loki", "name": "storage" }
            ]
          }
        ],
        "dnsPolicy": "ClusterFirst",
        "restartPolicy": "Always",
        "schedulerName": "default-scheduler",
        "terminationGracePeriodSeconds": 30,
        "volumes": [
          { "emptyDir": {}, "name": "storage" }
        ]
      }
    },
    "updateStrategy": {
      "type": "RollingUpdate"
    }
  }
}
oc create -f daemonset.json -n loki
daemonset.json:
{
  "apiVersion": "apps/v1",
  "kind": "DaemonSet",
  "metadata": {
    "annotations": {
      "deployment.kubernetes.io/revision": "2"
    },
    "creationTimestamp": "2019-09-05T01:16:37Z",
    "generation": 2,
    "labels": {
      "app": "promtail",
      "chart": "promtail-0.12.0",
      "heritage": "Tiller",
      "release": "lame-zorse"
    },
    "name": "lame-zorse-promtail",
    "namespace": "loki"
  },
  "spec": {
    "progressDeadlineSeconds": 600,
    "replicas": 1,
    "revisionHistoryLimit": 10,
    "selector": {
      "matchLabels": {
        "app": "promtail",
        "release": "lame-zorse"
      }
    },
    "strategy": {
      "rollingUpdate": {
        "maxSurge": 1,
        "maxUnavailable": 1
      },
      "type": "RollingUpdate"
    },
    "template": {
      "metadata": {
        "annotations": {
          "checksum/config": "75a25ee4f2869f54d394bf879549a9c89c343981a648f8d878f69bad65dba809",
          "prometheus.io/port": "http-metrics",
          "prometheus.io/scrape": "true"
        },
        "creationTimestamp": null,
        "labels": {
          "app": "promtail",
          "release": "lame-zorse"
        }
      },
      "spec": {
        "affinity": {},
        "containers": [
          {
            "args": [
              "-config.file=/etc/promtail/promtail.yaml",
              "-client.url=http://loki.loki.svc:3100/api/prom/push"
            ],
            "env": [
              {
                "name": "HOSTNAME",
                "valueFrom": {
                  "fieldRef": {
                    "apiVersion": "v1",
                    "fieldPath": "spec.nodeName"
                  }
                }
              }
            ],
            "image": "grafana/promtail:v0.3.0",
            "imagePullPolicy": "IfNotPresent",
            "name": "promtail",
            "ports": [
              { "containerPort": 3101, "name": "http-metrics", "protocol": "TCP" }
            ],
            "readinessProbe": {
              "failureThreshold": 5,
              "httpGet": {
                "path": "/ready",
                "port": "http-metrics",
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 10,
              "periodSeconds": 10,
              "successThreshold": 1,
              "timeoutSeconds": 1
            },
            "resources": {},
            "securityContext": {
              "readOnlyRootFilesystem": true,
              "runAsUser": 0
            },
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "volumeMounts": [
              { "mountPath": "/etc/promtail", "name": "config" },
              { "mountPath": "/run/promtail", "name": "run" },
              { "mountPath": "/var/lib/docker/containers", "name": "docker", "readOnly": true },
              { "mountPath": "/var/log/pods", "name": "pods", "readOnly": true }
            ]
          }
        ],
        "dnsPolicy": "ClusterFirst",
        "restartPolicy": "Always",
        "schedulerName": "default-scheduler",
        "securityContext": {},
        "terminationGracePeriodSeconds": 30,
        "volumes": [
          { "configMap": { "defaultMode": 420, "name": "lame-zorse-promtail" }, "name": "config" },
          { "hostPath": { "path": "/run/promtail", "type": "" }, "name": "run" },
          { "hostPath": { "path": "/var/lib/docker/containers", "type": "" }, "name": "docker" },
          { "hostPath": { "path": "/var/log/pods", "type": "" }, "name": "pods" }
        ]
      }
    }
  }
}
Install the Service:
oc create -f service.json -n loki
service.json:
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
    "creationTimestamp": "2019-09-04T09:37:49Z",
    "name": "loki",
    "namespace": "loki",
    "resourceVersion": "17800188",
    "selfLink": "/api/v1/namespaces/loki/services/loki",
    "uid": "a87fe237-cef7-11e9-b58e-e4a8b6cc47d2"
  },
  "spec": {
    "externalTrafficPolicy": "Cluster",
    "ports": [
      {
        "name": "lokiport",
        "port": 3100,
        "protocol": "TCP",
        "targetPort": 3100
      }
    ],
    "selector": {
      "app": "loki"
    },
    "sessionAffinity": "None",
    "type": "NodePort"
  },
  "status": {
    "loadBalancer": {}
  }
}
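The Service only routes traffic to the Loki Pods if its selector matches the labels on the StatefulSet's Pod template (app: loki in both manifests here). As a rough sanity check before creating the objects, that relationship can be verified offline. The sketch below is illustrative only: the helper name service_selects_pods is hypothetical, and the two dictionaries are trimmed copies of the manifests, not the full files.

```python
def service_selects_pods(service: dict, workload: dict) -> bool:
    """Return True if every Service selector label is present, with the
    same value, on the workload's Pod template."""
    selector = service["spec"].get("selector") or {}
    pod_labels = workload["spec"]["template"]["metadata"].get("labels") or {}
    # An empty selector is treated as a mismatch in this sketch.
    return bool(selector) and all(
        pod_labels.get(key) == value for key, value in selector.items()
    )


# Trimmed copies of service.json and statefulset.json (illustration only).
service = {"kind": "Service", "spec": {"selector": {"app": "loki"}}}
statefulset = {
    "kind": "StatefulSet",
    "spec": {
        "template": {"metadata": {"labels": {"app": "loki", "name": "loki"}}}
    },
}

print(service_selects_pods(service, statefulset))
```

In practice the dictionaries would be loaded from the JSON files with json.load rather than written inline.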
curl http://192.168.25.30:30972/api/prom/label
{
"values": ["alertmanager", "app", "component", "container_name", "controller_revision_hash", "deployment", "deploymentconfig", "docker_registry", "draft", "filename", "instance", "job", "logging_infra", "metrics_infra", "name", "namespace", "openshift_io_component", "pod_template_generation", "pod_template_hash", "project", "projectname", "prometheus", "provider", "release", "router", "servicename", "statefulset_kubernetes_io_pod_name", "stream", "tekton_dev_pipeline", "tekton_dev_pipelineRun", "tekton_dev_pipelineTask", "tekton_dev_task", "tekton_dev_taskRun", "type", "webconsole"]
}
curl http://192.168.25.30:30972/api/prom/label/namespace/values
{"values":["cicd","default","gitlab","grafanaserver","jenkins","jx-staging","kube-system","loki","mysql-exporter","new2","openshift-console","openshift-infra","openshift-logging","openshift-monitoring","openshift-node","openshift-sdn","openshift-web-console","tekton-pipelines","test111"]}
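The two label endpoints can also be called from a script. The following is a minimal sketch using only the Python standard library; the helper names (labels_url, label_values_url, parse_values, fetch_label_values) are made up for this example, and the base URL is simply the node address and NodePort used in the curl commands, which will differ in another cluster.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Node IP and NodePort taken from the curl examples; adjust for your cluster.
BASE_URL = "http://192.168.25.30:30972"


def labels_url(base_url: str) -> str:
    """URL that lists all label names known to Loki."""
    return f"{base_url}/api/prom/label"


def label_values_url(base_url: str, label: str) -> str:
    """URL that lists the values seen for one label."""
    return f"{base_url}/api/prom/label/{quote(label, safe='')}/values"


def parse_values(body: str) -> list:
    """Extract the "values" array from a label API response body."""
    return json.loads(body).get("values", [])


def fetch_label_values(base_url: str, label: str, timeout: float = 5.0) -> list:
    """Fetch and parse the values of one label (performs a network call)."""
    with urlopen(label_values_url(base_url, label), timeout=timeout) as resp:
        return parse_values(resp.read().decode("utf-8"))


print(label_values_url(BASE_URL, "namespace"))
print(parse_values('{"values": ["cicd", "default"]}'))
```

Calling fetch_label_values(BASE_URL, "namespace") against a reachable Loki instance should return a list like the one in the response above.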
http://192.168.25.30:30972/api/prom/query?direction=BACKWARD&limit=1000&regexp=&query={namespace="cicd"}&start=1567644457221000000&end=1567730857221000000&refId=A
The query parameters are:
- query: the query expression; the syntax is described in the following section. For example, {name=~"mysql.+"}, or {namespace="cicd"} |= "error", which returns the log lines in the cicd namespace that contain the word error.
- limit: the number of log lines to return.
- start: the start time, as a Unix timestamp (in nanoseconds, as in the URL above). The default is one hour ago.
- end: the end time, in the same format. The default is the current time.
- direction: forward or backward; useful when limit is specified. The default is backward.
- regexp: a regular expression used to filter the results.
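Assembling such a URL by hand is error-prone because the query expression contains braces and quotes that must be percent-encoded. Below is a minimal sketch of a URL builder; build_query_url is a hypothetical helper, and it fills in start and end explicitly with the same defaults the parameter descriptions give (one hour ago and now) rather than relying on the server.

```python
import time
from urllib.parse import parse_qs, urlencode, urlsplit

NS_PER_SECOND = 10**9


def build_query_url(base_url, query, limit=1000, start_ns=None, end_ns=None,
                    direction="BACKWARD", regexp=""):
    """Build a /api/prom/query URL with percent-encoded parameters."""
    if end_ns is None:
        end_ns = time.time_ns()                   # default: now
    if start_ns is None:
        start_ns = end_ns - 3600 * NS_PER_SECOND  # default: one hour earlier
    params = {
        "query": query,
        "limit": limit,
        "start": start_ns,
        "end": end_ns,
        "direction": direction,
    }
    if regexp:
        params["regexp"] = regexp                 # only sent when non-empty
    return f"{base_url}/api/prom/query?{urlencode(params)}"


url = build_query_url(
    "http://192.168.25.30:30972",
    '{namespace="cicd"}',
    start_ns=1567644457221000000,
    end_ns=1567730857221000000,
)
print(url)
# Decoding the query string again recovers the original expression.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The resulting URL can be passed to curl or to urllib.request.urlopen.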
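The label selector is written in curly braces, for example: {app="mysql", name="mysql-backup"}
The following label matching operators are supported:
- =: exactly equal.
- !=: not equal.
- =~: matches the regular expression.
- !~: does not match the regular expression.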
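To make the semantics of the four label operators concrete, here is a small Python model of how a selector is evaluated against a stream's label set. It is a sketch, not Loki's code: the function names are invented, Python's re module stands in for the Go regular expression engine, and it assumes Prometheus-style behaviour in which regular expressions must match the whole label value and a missing label is treated as an empty string.

```python
import re


def label_matches(labels: dict, name: str, op: str, value: str) -> bool:
    """Evaluate one matcher, e.g. ("name", "=~", "mysql.+")."""
    actual = labels.get(name, "")  # assumption: missing label == empty string
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None  # anchored match
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown operator: {op}")


def selector_matches(labels: dict, matchers: list) -> bool:
    """A stream is selected only if every matcher holds."""
    return all(label_matches(labels, n, op, v) for n, op, v in matchers)


stream = {"app": "mysql", "name": "mysql-backup"}
print(selector_matches(stream, [("app", "=", "mysql"), ("name", "=~", "mysql.+")]))
print(selector_matches(stream, [("name", "!~", "mysql.+")]))
```

Because the match is anchored, kafka-[23] selects kafka-2 and kafka-3 but not kafka-23.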
After the label selector, a filter expression can narrow the results further, for example:
- {job="mysql"} |= "error"
- {name="kafka"} |~ "tsdb-ops.*io:2003"
- {instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"
Filters can be chained:
- {job="mysql"} |= "error" != "timeout"
The filter operators are:
- |=: the line contains the string.
- !=: the line does not contain the string.
- |~: the line matches the regular expression.
- !~: the line does not match the regular expression.
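The effect of chaining filters can be illustrated with a few lines of Python. This is only a model of the semantics described above, with an invented helper name (line_passes) and made-up sample log lines; it assumes that regular expression filters may match anywhere in the line and uses Python's re module in place of the Go engine.

```python
import re


def line_passes(line: str, filters: list) -> bool:
    """Apply filters such as [("|=", "error"), ("!=", "timeout")] in order;
    the line is kept only if every filter accepts it."""
    for op, pattern in filters:
        if op == "|=":
            ok = pattern in line
        elif op == "!=":
            ok = pattern not in line
        elif op == "|~":
            ok = re.search(pattern, line) is not None
        elif op == "!~":
            ok = re.search(pattern, line) is None
        else:
            raise ValueError(f"unknown filter operator: {op}")
        if not ok:
            return False
    return True


# Hypothetical log lines, for illustration only.
lines = [
    "query error: connection refused",
    "query error: timeout",
    "query ok",
]
# Equivalent of: {job="mysql"} |= "error" != "timeout"
kept = [l for l in lines if line_passes(l, [("|=", "error"), ("!=", "timeout")])]
print(kept)
```

Each filter further reduces the set of lines that the previous one let through, which is why the order of cheap and expensive filters can matter for large result sets.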