[Bug]: bash standalone_embed.sh start fail maybe due to "error":"etcdserver: no space" #36748

Open
1 task done
HWZhang1234 opened this issue Oct 10, 2024 · 21 comments
Assignees
Labels
help wanted Extra attention is needed stale indicates no udpates for 30 days

Comments

@HWZhang1234

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:v2.4.6-gpu
- Deployment mode(standalone or cluster):standalone 
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

When I run bash standalone_embed.sh start, it fails to start. When I check the Docker containers, I can see one named milvus-standalone, but it has exited.

Expected Behavior

When I run bash standalone_embed.sh start and check the Docker containers, milvus-standalone should show as healthy.

Steps To Reproduce

Below is my standalone_embed.sh:
#!/usr/bin/env bash

# Licensed to the LF AI & Data foundation under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

run_embed() {
    cat << EOF > embedEtcd.yaml
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
EOF

    sudo docker run -d \
        --name milvus-standalone --gpus all \
        --security-opt seccomp:unconfined \
        -e ETCD_USE_EMBED=true \
        -e ETCD_DATA_DIR=/var/lib/milvus/etcd \
        -e ETCD_CONFIG_PATH=/milvus/configs/embedEtcd.yaml \
        -e COMMON_STORAGETYPE=local \
        -v $(pwd)/volumes/milvus:/var/lib/milvus \
        -v $(pwd)/embedEtcd.yaml:/milvus/configs/embedEtcd.yaml \
        -v $(pwd)/milvus.yaml:/milvus/configs/milvus.yaml \
        -p 19530:19530 \
        -p 9091:9091 \
        -p 2379:2379 \
        --health-cmd="curl -f http://localhost:9091/healthz" \
        --health-interval=30s \
        --health-start-period=90s \
        --health-timeout=20s \
        --health-retries=3 \
        milvusdb/milvus:v2.4.6-gpu \
        milvus run standalone 1> /dev/null
}

wait_for_milvus_running() {
    echo "Wait for Milvus Starting..."
    while true
    do
        res=`sudo docker ps|grep milvus-standalone|grep healthy|wc -l`
        if [ $res -eq 1 ]
        then
            echo "Start successfully."
            break
        fi
        sleep 1
    done
}

start() {
    res=`sudo docker ps|grep milvus-standalone|grep healthy|wc -l`
    if [ $res -eq 1 ]
    then
        echo "Milvus is running."
        exit 0
    fi

    res=`sudo docker ps -a|grep milvus-standalone|wc -l`
    if [ $res -eq 1 ]
    then
        echo "sudo docker start milvus-standalone 1> /dev/null"
        sudo docker start milvus-standalone 1> /dev/null
    else
        echo "run_embed"
        run_embed
    fi

    if [ $? -ne 0 ]
    then
        echo "Start failed."
        exit 1
    fi

    wait_for_milvus_running
}

stop() {
    sudo docker stop milvus-standalone 1> /dev/null

    if [ $? -ne 0 ]
    then
        echo "Stop failed."
        exit 1
    fi
    echo "Stop successfully."

}

delete() {
    res=`sudo docker ps|grep milvus-standalone|wc -l`
    if [ $res -eq 1 ]
    then
        echo "Please stop Milvus service before delete."
        exit 1
    fi
    sudo docker rm milvus-standalone 1> /dev/null
    if [ $? -ne 0 ]
    then
        echo "Delete failed."
        exit 1
    fi
    sudo rm -rf $(pwd)/volumes
    sudo rm -rf $(pwd)/embedEtcd.yaml
    echo "Delete successfully."
}


case $1 in
    start)
        start
        ;;
    stop)
        stop
        ;;
    delete)
        delete
        ;;
    *)
        echo "please use bash standalone_embed.sh start|stop|delete"
        ;;
esac
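
Note that wait_for_milvus_running, as written above, never returns if the container exits instead of becoming healthy. A minimal sketch of a bounded wait follows. The function name wait_for_healthy, the DOCKER override variable, and the 120-second default are assumptions made for illustration and are not part of the script above.

```shell
# Hypothetical helper (not in the script above): wait for a container to
# report "healthy", but give up on a timeout or as soon as it has exited.
# DOCKER can be overridden, e.g. DOCKER="sudo docker".
wait_for_healthy() {
    local name=$1 timeout=${2:-120} waited=0 state
    while [ "$waited" -lt "$timeout" ]; do
        # Prints e.g. "running healthy", "running starting" or "exited unhealthy".
        state=$(${DOCKER:-docker} inspect \
            --format '{{.State.Status}} {{if .State.Health}}{{.State.Health.Status}}{{end}}' \
            "$name" 2>/dev/null)
        case $state in
            "running healthy")
                echo "Start successfully."
                return 0
                ;;
            exited*|dead*)
                echo "Container $name has exited; check: docker logs $name"
                return 1
                ;;
        esac
        sleep 1
        waited=$((waited + 1))
    done
    echo "Timed out after ${timeout}s waiting for $name."
    return 1
}
```

It could be called as DOCKER="sudo docker" wait_for_healthy milvus-standalone 180 in place of the unbounded loop, so that an exited container is reported instead of waited on.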

Milvus Log

[2024/10/10 08:13:38.171 +00:00] [INFO] [distance/calc_distance_amd64.go:14] ["Hook avx for go simd distance computation"]
2024/10/10 08:13:38 maxprocs: Leaving GOMAXPROCS=40: CPU quota undefined

    __  _________ _   ____  ______
   /  |/  /  _/ /| | / / / / / __/
  / /|_/ // // /_| |/ / /_/ /\ \
 /_/  /_/___/____/___/\____/___/

Welcome to use Milvus!
Version: v2.4.6-gpu
Built: Tue Jul 23 14:19:56 UTC 2024
GitCommit: ce9ab52
GoVersion: go version go1.21.10 linux/amd64

TotalMem: 66990850048
UsedMem: 74571776

open pid file: /run/milvus/standalone.pid
lock pid file: /run/milvus/standalone.pid
[2024/10/10 08:13:38.221 +00:00] [INFO] [roles/roles.go:307] ["starting running Milvus components"]
[2024/10/10 08:13:38.221 +00:00] [INFO] [roles/roles.go:170] ["Enable Jemalloc"] ["Jemalloc Path"=/milvus/lib/libjemalloc.so]
[2024/10/10 08:13:38.233 +00:00] [DEBUG] [config/refresher.go:67] ["start refreshing configurations"] [source=FileSource]
[2024/10/10 08:13:38.241 +00:00] [INFO] [paramtable/hook_config.go:21] ["hook config"] [hook={}]
[2024/10/10 08:13:38.241 +00:00] [DEBUG] [config/refresher.go:67] ["start refreshing configurations"] [source=FileSource]
{"level":"info","ts":"2024-10-10T08:13:38.242Z","caller":"embed/etcd.go:124","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":"2024-10-10T08:13:38.242Z","caller":"embed/etcd.go:132","msg":"configuring client listeners","listen-client-urls":["http://0.0.0.0:2379"]}
{"level":"info","ts":"2024-10-10T08:13:38.242Z","caller":"embed/etcd.go:306","msg":"starting an etcd server","etcd-version":"3.5.5","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.21.10","go-os":"linux","go-arch":"amd64","max-cpu-set":40,"max-cpu-available":40,"member-initialized":true,"name":"default","data-dir":"/var/lib/milvus/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/milvus/etcd/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://0.0.0.0:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":[""],"host-whitelist":[""],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2024-10-10T08:13:44.250Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/milvus/etcd/member/snap/db","took":"6.007168599s"}
{"level":"info","ts":"2024-10-10T08:13:45.385Z","caller":"etcdserver/server.go:509","msg":"recovered v2 store from snapshot","snapshot-index":3300033,"snapshot-size":"8.8 kB"}
{"level":"info","ts":"2024-10-10T08:13:45.385Z","caller":"etcdserver/server.go:522","msg":"recovered v3 backend from snapshot","backend-size-bytes":2147467264,"backend-size":"2.1 GB","backend-size-in-use-bytes":2147446784,"backend-size-in-use":"2.1 GB"}
{"level":"info","ts":"2024-10-10T08:13:45.708Z","caller":"etcdserver/raft.go:529","msg":"restarting local member","cluster-id":"cdf818194e3a8c32","local-member-id":"8e9e05c52164694d","commit-index":3388434}
{"level":"info","ts":"2024-10-10T08:13:45.710Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d switched to configuration voters=(10276657743932975437)"}
{"level":"info","ts":"2024-10-10T08:13:45.710Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d became follower at term 16"}
{"level":"info","ts":"2024-10-10T08:13:45.710Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft 8e9e05c52164694d [peers: [8e9e05c52164694d], term: 16, commit: 3388434, applied: 3300033, lastindex: 3388434, lastterm: 16]"}
{"level":"info","ts":"2024-10-10T08:13:45.711Z","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
{"level":"info","ts":"2024-10-10T08:13:45.711Z","caller":"membership/cluster.go:278","msg":"recovered/added member from store","cluster-id":"cdf818194e3a8c32","local-member-id":"8e9e05c52164694d","recovered-remote-peer-id":"8e9e05c52164694d","recovered-remote-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":"2024-10-10T08:13:45.711Z","caller":"membership/cluster.go:287","msg":"set cluster version from store","cluster-version":"3.5"}
{"level":"warn","ts":"2024-10-10T08:13:45.719Z","caller":"auth/store.go:1233","msg":"simple token is not cryptographically signed"}
{"level":"info","ts":"2024-10-10T08:13:53.505Z","caller":"mvcc/kvstore.go:393","msg":"kvstore restored","current-rev":3297900}
{"level":"info","ts":"2024-10-10T08:13:53.513Z","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
{"level":"info","ts":"2024-10-10T08:13:53.517Z","caller":"etcdserver/server.go:845","msg":"starting etcd server","local-member-id":"8e9e05c52164694d","local-server-version":"3.5.5","cluster-id":"cdf818194e3a8c32","cluster-version":"3.5"}
{"level":"info","ts":"2024-10-10T08:13:53.518Z","caller":"etcdserver/server.go:738","msg":"started as single-node; fast-forwarding election ticks","local-member-id":"8e9e05c52164694d","forward-ticks":9,"forward-duration":"900ms","election-ticks":10,"election-timeout":"1s"}
{"level":"info","ts":"2024-10-10T08:13:53.525Z","caller":"embed/etcd.go:584","msg":"serving peer traffic","address":"127.0.0.1:2380"}
{"level":"info","ts":"2024-10-10T08:13:53.525Z","caller":"embed/etcd.go:556","msg":"cmux::serve","address":"127.0.0.1:2380"}
{"level":"info","ts":"2024-10-10T08:13:53.525Z","caller":"embed/etcd.go:275","msg":"now serving peer/client/metrics","local-member-id":"8e9e05c52164694d","initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://0.0.0.0:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[]}
[2024/10/10 08:13:53.525 +00:00] [INFO] [etcd/etcd_server.go:58] ["finish init Etcd config"] [path=/milvus/configs/embedEtcd.yaml] [data=/var/lib/milvus/etcd]
[2024/10/10 08:13:53.525 +00:00] [INFO] [roles/roles.go:256] [setupPrometheusHTTPServer]
[2024/10/10 08:13:53.525 +00:00] [INFO] [http/server.go:152] ["management listen"] [addr=:9091]
[2024/10/10 08:13:53.526 +00:00] [INFO] [rootcoord/root_coord.go:154] ["update rootcoord state"] [state=Abnormal]
[2024/10/10 08:13:53.526 +00:00] [DEBUG] [rootcoord/service.go:184] ["init params done.."]
[2024/10/10 08:13:53.526 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.526 +00:00] [DEBUG] [rootcoord/service.go:204] ["etcd connect done ..."]
[2024/10/10 08:13:53.526 +00:00] [DEBUG] [rootcoord/service.go:264] ["start grpc "] [port=53100]
[2024/10/10 08:13:53.526 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.526 +00:00] [DEBUG] [datacoord/service.go:150] ["network port"] [port=13333]
[2024/10/10 08:13:53.526 +00:00] [INFO] [components/index_coord.go:38] ["IndexCoord running ..."]
[2024/10/10 08:13:53.527 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.527 +00:00] [DEBUG] [querycoord/service.go:218] [network] [port=19531]
[2024/10/10 08:13:53.527 +00:00] [DEBUG] [querynode/service.go:104] [QueryNode] [port=21123]
[2024/10/10 08:13:53.527 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.527 +00:00] [DEBUG] [querynode/service.go:124] ["QueryNode connect to etcd successfully"]
[2024/10/10 08:13:53.528 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.528 +00:00] [DEBUG] [indexnode/indexnode.go:115] ["New IndexNode ..."]
[2024/10/10 08:13:53.528 +00:00] [INFO] [datanode/service.go:256] ["DataNode address"] [address=172.17.0.3:21124]
[2024/10/10 08:13:53.528 +00:00] [INFO] [datanode/service.go:257] ["DataNode serverID"] [serverID=0]
[2024/10/10 08:13:53.528 +00:00] [DEBUG] [indexnode/service.go:87] [IndexNode] ["network address"=172.17.0.3:21121] ["network port: "=21121]
[2024/10/10 08:13:53.528 +00:00] [INFO] [proxy/lb_policy.go:78] ["use look_aside policy on replica selection"]
[2024/10/10 08:13:53.528 +00:00] [INFO] [runtime/asm_amd64.s:1650] ["Start check query node health loop"]
[2024/10/10 08:13:53.528 +00:00] [DEBUG] [proxy/simple_rate_limiter.go:225] ["RateLimiter register for rateType"] [rateType=DDLIndex] [rateLimit=+inf] [burst=1.7976931348623157e+308]
[2024/10/10 08:13:53.528 +00:00] [DEBUG] [proxy/simple_rate_limiter.go:225] ["RateLimiter register for rateType"] [rateType=DQLSearch] [rateLimit=+inf] [burst=1.7976931348623157e+308]
[2024/10/10 08:13:53.529 +00:00] [INFO] [hookutil/hook.go:46] ["empty so path, skip to load plugin"]
[2024/10/10 08:13:53.529 +00:00] [DEBUG] [proxy/service.go:122] ["create a new Proxy instance"] [state=2]
[2024/10/10 08:13:53.529 +00:00] [DEBUG] [proxy/service.go:423] ["init Proxy server"]
[2024/10/10 08:13:53.529 +00:00] [DEBUG] [proxy/service.go:454] ["Proxy init service's parameter table done"]
[2024/10/10 08:13:53.529 +00:00] [DEBUG] [proxy/service.go:456] ["Proxy init http server's parameter table done"]
[2024/10/10 08:13:53.529 +00:00] [DEBUG] [proxy/service.go:463] ["init Proxy's parameter table done"] [internalAddress=172.17.0.3:19529] [externalAddress=172.17.0.3:19530]
[2024/10/10 08:13:53.529 +00:00] [INFO] [accesslog/global.go:145] ["Init access logger success"]
[2024/10/10 08:13:53.529 +00:00] [DEBUG] [proxy/service.go:470] ["init Proxy's tracer done"] ["service name"="Proxy ip: 172.17.0.3, port: 19530"]
[2024/10/10 08:13:53.529 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.529 +00:00] [INFO] [proxy/service.go:370] ["Proxy internal server listen on tcp"] [port=19529]
[2024/10/10 08:13:53.529 +00:00] [INFO] [proxy/service.go:377] ["Proxy internal server already listen on tcp"] [port=19529]
[2024/10/10 08:13:53.530 +00:00] [INFO] [proxy/service.go:502] ["Proxy server listen on tcp"] [port=19530]
[2024/10/10 08:13:53.530 +00:00] [INFO] [proxy/service.go:409] ["create Proxy internal grpc server"] ["enforcement policy"="{"MinTime":5000000000,"PermitWithoutStream":true}"] ["server parameters"="{"MaxConnectionIdle":0,"MaxConnectionAge":0,"MaxConnectionAgeGrace":0,"Time":60000000000,"Timeout":10000000000}"]
[2024/10/10 08:13:53.530 +00:00] [INFO] [proxy/service.go:505] ["Proxy server already listen on tcp"] [port=19530]
[2024/10/10 08:13:53.530 +00:00] [DEBUG] [proxy/service.go:266] ["Get proxy rate limiter done"] [port=19530]
[2024/10/10 08:13:53.530 +00:00] [DEBUG] [proxy/service.go:345] ["create Proxy grpc server"] ["enforcement policy"="{"MinTime":5000000000,"PermitWithoutStream":true}"] ["server parameters"="{"MaxConnectionIdle":0,"MaxConnectionAge":0,"MaxConnectionAgeGrace":0,"Time":60000000000,"Timeout":10000000000}"]
[2024/10/10 08:13:53.530 +00:00] [INFO] [proxy/service.go:586] ["register Proxy http server"]
[2024/10/10 08:13:53.530 +00:00] [DEBUG] [proxy/service.go:593] ["create RootCoord client for Proxy"]
[2024/10/10 08:13:53.530 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.531 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
{"level":"info","ts":"2024-10-10T08:13:53.531Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d no leader at term 16; dropping index reading msg"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d is starting a new election at term 16"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d became pre-candidate at term 16"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d received MsgPreVoteResp from 8e9e05c52164694d at term 16"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d became candidate at term 17"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 17"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8e9e05c52164694d became leader at term 17"}
{"level":"info","ts":"2024-10-10T08:13:53.614Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 17"}
[2024/10/10 08:13:53.617 +00:00] [WARN] [sessionutil/session_util.go:266] ["retry func failed"] [retried=0] [error="etcdserver: leader changed"]
[2024/10/10 08:13:53.627 +00:00] [DEBUG] [rootcoord/service.go:221] ["grpc init done ..."]
[2024/10/10 08:13:53.627 +00:00] [DEBUG] [rootcoord/service.go:224] ["RootCoord start to create DataCoord client"]
[2024/10/10 08:13:53.627 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
[2024/10/10 08:13:53.627 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
[2024/10/10 08:13:53.627 +00:00] [INFO] [dependency/factory.go:86] ["try to init mq"] [standalone=true] [mqType=rocksmq]
[2024/10/10 08:13:53.627 +00:00] [DEBUG] [server/global_rmq.go:39] ["initializing global rmq"] [path=/var/lib/milvus/rdb_data]
[2024/10/10 08:13:53.627 +00:00] [DEBUG] [querynode/service.go:134] [QueryNode] [State=Initializing]
[2024/10/10 08:13:53.628 +00:00] [INFO] [querynodev2/server.go:286] ["QueryNode session info"] [metaPath=by-dev/meta]
[2024/10/10 08:13:53.628 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
[2024/10/10 08:13:53.628 +00:00] [INFO] [datanode/service.go:266] ["initializing RootCoord client for DataNode"]
[2024/10/10 08:13:53.628 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
[2024/10/10 08:13:53.628 +00:00] [INFO] [etcd/etcd_util.go:49] ["create etcd client"] [useEmbedEtcd=true] [useSSL=false] [endpoints="[localhost:2379]"] [minVersion=1.3]
[2024/10/10 08:13:53.629 +00:00] [INFO] [indexnode/indexnode.go:207] ["IndexNode init"] [state=Initializing]
[2024/10/10 08:13:53.629 +00:00] [DEBUG] [sessionutil/session_util.go:257] ["Session try to connect to etcd"]
[2024/10/10 08:13:53.629 +00:00] [DEBUG] [server/rocksmq_impl.go:177] ["Start rocksmq"] ["max proc"=40] [parallism=4] ["lru cache"=4019451002]
{"level":"warn","ts":"2024-10-10T08:13:53.661Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"3.839µs","request":"header:<ID:7587881975673440009 > txn:<compare:<key:"by-dev/meta/session/id" version:0 > success:<request_put:<key:"by-dev/meta/session/id" value_size:1 >> failure:<>>","response":"","error":"etcdserver: no space"}

Anything else?

Should I increase memoryLimit?

@HWZhang1234 HWZhang1234 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 10, 2024
@HWZhang1234
Author

milvusLog.txt

@xiaofan-luan
Collaborator

On your host machine, you need a directory $(pwd)/volumes, and it needs to have enough space (usually more than 10 GB to start).

@yanliang567
Contributor

/assign @HWZhang1234
BTW, it is recommended to use SSD volumes for the etcd service.

/unassign

@yanliang567 yanliang567 added help wanted Extra attention is needed and removed kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 11, 2024
@HWZhang1234
Author

I have 79 GB available for the directory $(pwd)/volumes.
I want to know whether the log line below is the failure point, and whether I can increase backend-size-bytes:
{"level":"info","ts":"2024-10-11T02:19:10.992Z","caller":"etcdserver/server.go:522","msg":"recovered v3 backend from snapshot","backend-size-bytes":2147467264,"backend-size":"2.1 GB","backend-size-in-use-bytes":2147446784,"backend-size-in-use":"2.1 GB"}
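
To put the numbers in that log line in perspective, the in-use size can be compared with the quota-backend-bytes value of 2147483648 reported in the etcd startup log. A small sketch of the arithmetic follows. The function name quota_usage_percent is hypothetical.

```shell
# Hypothetical helper: print how full the etcd backend is, as a percentage
# of the configured quota. Both arguments are byte counts from the etcd log.
quota_usage_percent() {
    local in_use=$1 quota=$2
    awk -v u="$in_use" -v q="$quota" 'BEGIN { printf "%.3f\n", u * 100 / q }'
}

# Values from the logs in this issue:
#   backend-size-in-use-bytes = 2147446784, quota-backend-bytes = 2147483648
quota_usage_percent 2147446784 2147483648   # prints 99.998
echo "$((2147483648 - 2147446784)) bytes left under the quota"   # 36864 bytes
```

With only about 36 KiB of headroom under the 2 GiB quota, a write being rejected with "etcdserver: no space" is consistent with these figures.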

@HWZhang1234
Author

One more piece of information: it ran successfully before, and I have used it for two months. It only started failing to start yesterday.

@yanliang567
Contributor

/assign @LoveEachDay
any ideas?

@HWZhang1234
Author

HWZhang1234 commented Oct 11, 2024

Also, is there any way to increase the etcd size in standalone_embed.sh, for example like below?
    sudo docker run -d \
        --name milvus-standalone --gpus all \
        --security-opt seccomp:unconfined \
        -e ETCD_USE_EMBED=true \
        -e ETCD_DATA_DIR=/var/lib/milvus/etcd \
        -e ETCD_CONFIG_PATH=/milvus/configs/embedEtcd.yaml \
        -e COMMON_STORAGETYPE=local \
        -v $(pwd)/volumes/milvus:/var/lib/milvus \
        -v $(pwd)/embedEtcd.yaml:/milvus/configs/embedEtcd.yaml \
        -v $(pwd)/milvus.yaml:/milvus/configs/milvus.yaml \
        -v $(pwd)/user.yaml:/milvus/configs/user.yaml \
        -p 19530:19530 \
        -p 9091:9091 \
        -p 2379:2379 \
        --health-cmd="curl -f http://localhost:9091/healthz" \
        --health-interval=30s \
        --health-start-period=90s \
        --health-timeout=20s \
        --health-retries=3 \
        milvusdb/milvus:v2.4.6-gpu \
        milvus run standalone 1> /dev/null

@HWZhang1234
Author

[2024/10/11 02:45:08.163 +00:00] [WARN] [sessionutil/session_util.go:365] ["Session Txn failed"] [key=id] [error="etcdserver: mvcc: database space exceeded"]
panic: etcdserver: mvcc: database space exceeded

@xiaofan-luan
Collaborator

backend-size 2.1 GB

That means you are using a huge amount of etcd space.
How much data do you have?

@zwd1208 should we tune the etcd compaction policy?

@HWZhang1234
Author

I have about 80 GB of data.

@LoveEachDay
Contributor

@HWZhang1234 You've manually changed the default embedded etcd configuration, which disabled compaction.
Here is your config:

    cat << EOF > embedEtcd.yaml
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
EOF

Could you change standalone_embed.sh to add the following three lines, like this:

    cat << EOF > embedEtcd.yaml
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
quota-backend-bytes: 4294967296
auto-compaction-mode: revision
auto-compaction-retention: '1000'
EOF

Then try to stop and start again like this:

bash standalone_embed.sh stop
bash standalone_embed.sh start
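
One thing worth checking before restarting: in the script posted in this issue, run_embed (and therefore the heredoc that writes embedEtcd.yaml) is only reached when no container named milvus-standalone exists. Otherwise start just runs docker start on the existing container. A minimal sketch for confirming that the file on the host actually contains the three new keys follows. The function name check_embed_config is hypothetical.

```shell
# Hypothetical helper: verify that an embedEtcd.yaml file sets the three keys
# suggested in this thread. Prints each missing key and returns non-zero if
# any of them is absent.
check_embed_config() {
    local file=${1:-embedEtcd.yaml} key missing=0
    for key in quota-backend-bytes auto-compaction-mode auto-compaction-retention; do
        if ! grep -q "^${key}:" "$file"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_embed_config "$(pwd)/embedEtcd.yaml" && echo "config looks complete" could be run from the directory that holds the script.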

@HWZhang1234
Author

Yes, I tried the parameters you shared today, but it still reports the same error.
https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh

@LoveEachDay
Contributor

@HWZhang1234 Could you use the following command to export the milvus container log:

docker logs milvus-standalone > milvus.log

It seems the embedded etcd configuration is not set correctly.

You should be able to see a log line like this:

{"level":"info","ts":"2024-10-11T06:23:20.927Z","caller":"embed/etcd.go:306","msg":"starting an etcd server","etcd-version":"3.5.5","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.20.7","go-os":"linux","go-arch":"arm64","max-cpu-set":4,"max-cpu-available":4,"member-initialized":false,"name":"default","data-dir":"/var/lib/milvus/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/milvus/etcd/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://0.0.0.0:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"default=http://localhost:2380","initial-cluster-state":"new","initial-cluster-token":"etcd-cluster","quota-backend-bytes":4294967296,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"revision","auto-compaction-retention":"1µs","auto-compaction-interval":"1µs","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
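
A quick way to compare the running configuration with the expected one is to pull the relevant fields out of the "starting an etcd server" line in the exported log. A minimal sketch using only grep, tail, and sed follows. The function name etcd_setting is hypothetical, and it assumes each JSON log entry sits on a single line, as in the logs quoted in this issue.

```shell
# Hypothetical helper: print the value of one field from etcd's
# "starting an etcd server" JSON log line, read from a log file.
# If the line appears several times, the last occurrence is used.
etcd_setting() {
    local field=$1 logfile=$2
    grep '"msg":"starting an etcd server"' "$logfile" \
        | tail -n 1 \
        | sed -n "s/.*\"${field}\":\"\{0,1\}\([^\",}]*\).*/\1/p"
}
```

With the milvus.log file exported by the docker logs command above, etcd_setting quota-backend-bytes milvus.log should print 4294967296 if the new quota was picked up, and etcd_setting auto-compaction-mode milvus.log should print revision.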

@HWZhang1234
Author

@LoveEachDay Thanks for your support. Please check the Milvus log and my standalone_embed.sh.
Milvus_log.txt
#!/usr/bin/env bash

# Licensed to the LF AI & Data foundation under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

run_embed() {
    cat << EOF > embedEtcd.yaml
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
quota-backend-bytes: 4294967296
auto-compaction-mode: revision
auto-compaction-retention: '1000'
EOF

    cat << EOF > user.yaml
# Extra config to override default milvus.yaml
EOF

    sudo docker run -d \
        --name milvus-standalone \
        --security-opt seccomp:unconfined \
        -e ETCD_USE_EMBED=true \
        -e ETCD_DATA_DIR=/var/lib/milvus/etcd \
        -e ETCD_CONFIG_PATH=/milvus/configs/embedEtcd.yaml \
        -e COMMON_STORAGETYPE=local \
        -v $(pwd)/volumes/milvus:/var/lib/milvus \
        -v $(pwd)/embedEtcd.yaml:/milvus/configs/embedEtcd.yaml \
        -v $(pwd)/user.yaml:/milvus/configs/user.yaml \
        -p 19530:19530 \
        -p 9091:9091 \
        -p 2379:2379 \
        --health-cmd="curl -f http://localhost:9091/healthz" \
        --health-interval=30s \
        --health-start-period=90s \
        --health-timeout=20s \
        --health-retries=3 \
        milvusdb/milvus:v2.4.6-gpu \
        milvus run standalone  1> /dev/null
}

wait_for_milvus_running() {
    echo "Wait for Milvus Starting..."
    while true
    do
        res=`sudo docker ps|grep milvus-standalone|grep healthy|wc -l`
        if [ $res -eq 1 ]
        then
            echo "Start successfully."
            echo "To change the default Milvus configuration, add your settings to the user.yaml file and then restart the service."
            break
        fi
        sleep 1
    done
}

start() {
    res=`sudo docker ps|grep milvus-standalone|grep healthy|wc -l`
    if [ $res -eq 1 ]
    then
        echo "Milvus is running."
        exit 0
    fi

    res=`sudo docker ps -a|grep milvus-standalone|wc -l`
    if [ $res -eq 1 ]
    then
        sudo docker start milvus-standalone 1> /dev/null
    else
        run_embed
    fi

    if [ $? -ne 0 ]
    then
        echo "Start failed."
        exit 1
    fi

    wait_for_milvus_running
}

stop() {
    sudo docker stop milvus-standalone 1> /dev/null

    if [ $? -ne 0 ]
    then
        echo "Stop failed."
        exit 1
    fi
    echo "Stop successfully."
}

delete() {
    res=`sudo docker ps|grep milvus-standalone|wc -l`
    if [ $res -eq 1 ]
    then
        echo "Please stop Milvus service before delete."
        exit 1
    fi
    sudo docker rm milvus-standalone 1> /dev/null
    if [ $? -ne 0 ]
    then
        echo "Delete failed."
        exit 1
    fi
    sudo rm -rf $(pwd)/volumes
    sudo rm -rf $(pwd)/embedEtcd.yaml
    sudo rm -rf $(pwd)/user.yaml
    echo "Delete successfully."
}

case $1 in
    restart)
        stop
        start
        ;;
    start)
        start
        ;;
    stop)
        stop
        ;;
    delete)
        delete
        ;;
    *)
        echo "please use bash standalone_embed.sh restart|start|stop|delete"
        ;;
esac

@HWZhang1234
Author

BTW.If any issues.You can call me at 13255291129.Or can you add my wechat 13255291129.Thanks for your support

@HWZhang1234
Author

It looks like the parameter "quota-backend-bytes: 4294967296" does not take effect.

@HWZhang1234
Author

Adding one more milvus.log:
milvus.log

@LoveEachDay
Contributor

@HWZhang1234 The attached log is incomplete: it lacks the startup log for the embedded etcd.

@HWZhang1234
Author

milvus.log
I re-captured the log, and it looks nearly the same as before. Could you please help check it?

@HWZhang1234
Author

I have another question. I downloaded a new version of Milvus, and it also fails to start. There is no data in this new version, so why does it still fail to start? I captured the Milvus log; could you please help check it?
milvus_1.log


stale bot commented Nov 16, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Nov 16, 2024

4 participants