Shiva tabletserver fails to start


#1

Mon Jan 15 10:12:17 CST 2018 [Manager] Starting task local part …
target yaml path on [Manager] is /var/lib/transwarp-manager/master/content/resources/services/shiva1/shiva-tabletserver.yaml
start to generate shiva-tabletserver.yaml on [Manager]…
start handle local part, dataModel is
Map(transwarpRepo -> http://192.168.110.45:8180/pub/transwarp, current.user -> root, service -> Map(tabletserver.container.limits.cpu -> -1, keytab -> /etc/shiva1/conf/shiva.keytab, tabletserver.recover.recover_data_max_concurrent -> 4, tabletserver.raft.heartbeat_period_ms -> 5000, master.log.log_file_num -> 10, master.raft.max_replicate_buffer_size_mb -> 16, master.raft.disruption_threshold -> 3, tabletserver.rpc_service.manage_service_thread_num -> 1, master.memory.ratio -> -1, tabletserver.conf -> Map(tabletserver.recover.recover_data_max_concurrent -> 4, tabletserver.raft.heartbeat_period_ms -> 5000, tabletserver.rpc_service.manage_service_thread_num -> 1, tabletserver.recover.recover_rpc_call_timeout_ms -> 10000, tabletserver.rpc_service.recover_service_thread_num -> 4, tabletserver.log.log_max_size -> 1024, tabletserver.tabletserver.topology_conf -> /etc/shiva1/conf/topology.conf, tabletserver.cache.kv_block_cache -> 10, tabletserver.raft.max_replicate_buffer_size_mb -> 64, tabletserver.thread_pool.tablet_reader_thread_num -> 8, tabletserver.tabletserver.store_conf -> /etc/shiva1/conf/store.conf, tabletserver.rpc_client.rpc_timeout_ms -> 10000, tabletserver.store.disk_write_thread_num -> 1, tabletserver.log.log_file_num -> 10, tabletserver.raft.disruption_threshold -> 3, tabletserver.rpc_client.rpc_callback_thread_num -> 4, tabletserver.rpc_service.tablet_service_thread_num -> 8, tabletserver.raft.max_batch_size_b -> 2097152, tabletserver.rpc_client.rpc_work_thread_num -> 4, tabletserver.log.log_dir -> /var/log/shiva1, tabletserver.log.log_level -> 5, tabletserver.raft.election_threshold -> 5, tabletserver.rpc_service.raft_service_thread_num -> 4, tabletserver.tabletserver.max_tablet_reader_queue_size -> 3000, tabletserver.rpc_service.manage_service_port -> 8002, tabletserver.recover.recover_data_buffer_size_mb -> 4), plugins -> List(), master.recover.recover_data_max_concurrent -> 2, topology.topology.rack -> rack1, tabletserver.container.limits.memory -> -1, tabletserver.memory.ratio -> -1, tabletserver.recover.recover_rpc_call_timeout_ms -> 10000, tabletserver.rpc_service.recover_service_thread_num -> 4, tabletserver.log.log_max_size -> 1024, webserver.container.limits.memory -> -1, tabletserver.tabletserver.topology_conf -> /etc/shiva1/conf/topology.conf, auth -> kerberos, tabletserver.cache.kv_block_cache -> 10, master.rpc_service.recover_service_thread_num -> 1, tabletserver.raft.max_replicate_buffer_size_mb -> 64, tabletserver.thread_pool.tablet_reader_thread_num -> 8, master.scheduler.bulk_lease_timeout_s -> 600, master.conf -> Map(master.log.log_file_num -> 10, master.raft.max_replicate_buffer_size_mb -> 16, master.raft.disruption_threshold -> 3, master.recover.recover_data_max_concurrent -> 2, master.rpc_service.recover_service_thread_num -> 1, master.scheduler.bulk_lease_timeout_s -> 600, master.rpc_client.rpc_timeout_ms -> 10000, master.rpc_client.rpc_work_thread_num -> 8, master.rpc_service.master_service_thread_num -> 1, master.log.log_max_size -> 1024, master.rpc_service.monitor_service_thread_num -> 8, master.log.log_level -> 5, master.raft.heartbeat_period_ms -> 5000, master.raft.election_threshold -> 5, master.rpc_service.raft_service_thread_num -> 2, master.log.log_dir -> /var/log/shiva1, master.scheduler.dead_server_timeout_s -> 60, master.scheduler.bulk_gc_period_s -> 60, master.recover.recover_data_buffer_size_mb -> 4, master.scheduler.get_server_statistic_period_s -> 10, master.recover.recover_rpc_call_timeout_ms -> 10000, master.raft.max_batch_size_b -> 
2097152, master.rpc_service.master_service_port -> 9630, master.rpc_client.rpc_callback_thread_num -> 8, master.scheduler.get_all_tablets_period_s -> 1800), master.rpc_client.rpc_timeout_ms -> 10000, master.container.limits.memory -> -1, domain -> dc=tdh, master.container.limits.cpu -> -1, tabletserver.tabletserver.store_conf -> /etc/shiva1/conf/store.conf, idc246-046 -> Map(master.master.data_path -> /vdir/mnt/sata1/shiva-master/data, store.capacity -> 4, tabletserver.store.datadirs -> /vdir/mnt/sata1/shiva-tabletserver/data,/vdir/mnt/sata2/shiva-tabletserver/data,/vdir/mnt/sata3/shiva-tabletserver/data,/vdir/mnt/ssd/shiva-tabletserver/data), master.rpc_client.rpc_work_thread_num -> 8, tabletserver.rpc_client.rpc_timeout_ms -> 10000, webserver.container.limits.cpu -> -1, tabletserver.store.disk_write_thread_num -> 1, store.conf -> Map(), master.rpc_service.master_service_thread_num -> 1, master.log.log_max_size -> 1024, tabletserver.container.requests.memory -> -1, master.rpc_service.monitor_service_thread_num -> 8, tabletserver.log.log_file_num -> 10, master.log.log_level -> 5, shiva-restful.sh -> Map(http.port -> 4567), webserver.memory.ratio -> -1, tabletserver.raft.disruption_threshold -> 3, master.raft.heartbeat_period_ms -> 5000, master.raft.election_threshold -> 5, master.rpc_service.raft_service_thread_num -> 2, tabletserver.rpc_client.rpc_callback_thread_num -> 4, kdc -> Map(hostname -> idc246-045, port -> 1088), topology.conf -> Map(topology.topology.rack -> rack1), idc246-047 -> Map(master.master.data_path -> /vdir/mnt/sata1/shiva-master/data, store.capacity -> 4, tabletserver.store.datadirs -> /vdir/mnt/sata1/shiva-tabletserver/data,/vdir/mnt/sata2/shiva-tabletserver/data,/vdir/mnt/sata3/shiva-tabletserver/data,/vdir/mnt/ssd/shiva-tabletserver/data), http.port -> 4567, master.log.log_dir -> /var/log/shiva1, id -> 9, master.scheduler.dead_server_timeout_s -> 60, roles -> Map(SHIVA_TABLETSERVER -> List(Map(id -> 48, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 49, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false), Map(id -> 50, hostname -> idc246-047, ip -> 192.168.110.47, toDecommission -> false)), SHIVA_MASTER -> List(Map(id -> 44, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 45, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false), Map(id -> 46, hostname -> idc246-047, ip -> 192.168.110.47, toDecommission -> false)), SHIVA_WEBSERVER -> List(Map(id -> 61, hostname -> idc246-047, ip -> 192.168.110.47, toDecommission -> false))), masterPrincipal -> , tabletserver.rpc_service.tablet_service_thread_num -> 8, tabletserver.raft.max_batch_size_b -> 2097152, tabletserver.rpc_client.rpc_work_thread_num -> 4, master.scheduler.bulk_gc_period_s -> 60, master.recover.recover_data_buffer_size_mb -> 4, master.scheduler.get_server_statistic_period_s -> 10, webserver.container.requests.memory -> -1, tabletserver.log.log_dir -> /var/log/shiva1, idc246-045 -> Map(master.master.data_path -> /vdir/mnt/sata1/shiva-master/data, store.capacity -> 4, tabletserver.store.datadirs -> /vdir/mnt/sata1/shiva-tabletserver/data,/vdir/mnt/sata2/shiva-tabletserver/data,/vdir/mnt/sata3/shiva-tabletserver/data,/vdir/mnt/ssd/shiva-tabletserver/data), tabletserver.log.log_level -> 5, master.container.requests.cpu -> -1, realm -> TDH, webserver.memory -> 16, tabletserver.raft.election_threshold -> 5, master.recover.recover_rpc_call_timeout_ms -> 10000, master.raft.max_batch_size_b -> 2097152, 
master.rpc_service.master_service_port -> 9630, tabletserver.rpc_service.raft_service_thread_num -> 4, webserver.container.requests.cpu -> -1, tabletserver.container.requests.cpu -> -1, tabletserver.tabletserver.max_tablet_reader_queue_size -> 3000, tabletserver.rpc_service.manage_service_port -> 8002, master.container.requests.memory -> -1, master.rpc_client.rpc_callback_thread_num -> 8, sid -> shiva1, master.scheduler.get_all_tablets_period_s -> 1800, tabletserver.recover.recover_data_buffer_size_mb -> 4), dependencies -> Map(TOS -> Map(keytab -> /etc/tos/conf/tos.keytab, tos.master.apiserver.secure.port -> 8553, tos.master.dashboard.username -> dashboard, plugins -> List(), tos.master.controller.port -> 10252, auth -> simple, domain -> dc=tdh, tos.slave.kubelet.port -> 10250, idc246-046 -> Map(tos.master.etcd.initial.cluster.state -> new), tos.master.etcd.heartbeat.interval -> 250, tos.master.apiserver.port -> 8080, kdc -> Map(hostname -> idc246-045, port -> 1088), idc246-047 -> Map(tos.master.etcd.initial.cluster.state -> new), id -> 1, tos.registry.ui.port -> 8081, roles -> Map(TOS_REGISTRY -> List(Map(id -> 4, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false)), TOS_MASTER -> List(Map(id -> 5, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 6, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false), Map(id -> 7, hostname -> idc246-047, ip -> 192.168.110.47, toDecommission -> false)), TOS_SLAVE -> List(Map(id -> 1, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 2, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false), Map(id -> 3, hostname -> idc246-047, ip -> 192.168.110.47, toDecommission -> false))), masterPrincipal -> , tos.master.dashboard.password -> password, idc246-045 -> Map(tos.master.etcd.initial.cluster.state -> new), realm -> TDH, tos.master.etcd.port -> 4001, tos.registry.port -> 5000, tos.slave.kubelet.healthzport -> 10248, tos.master.etcd.election.timeout -> 1250, tos.master.leader.elect.port -> 4002, sid -> tos, tos.master.scheduler.port -> 10251), LICENSE_SERVICE -> Map(keytab -> /etc/transwarp_license_cluster/conf/license_service.keytab, syncLimit -> 5, zoo_cfg -> Map(maxClientCnxns -> 0, tickTime -> 9000, initLimit -> 10, syncLimit -> 5), plugins -> List(), auth -> simple, zookeeper.jmxremote.port -> 9922, domain -> dc=tdh, idc246-046 -> Map(zookeeper.client.port -> 2291, zookeeper.leader.elect.port -> 3988, zookeeper.peer.communicate.port -> 2988), zookeeper.container.requests.memory -> -1, kdc -> Map(hostname -> idc246-045, port -> 1088), zookeeper.container.limits.cpu -> -1, idc246-047 -> Map(zookeeper.client.port -> 2291, zookeeper.leader.elect.port -> 3988, zookeeper.peer.communicate.port -> 2988), id -> 2, maxClientCnxns -> 0, roles -> Map(LICENSE_NODE -> List(Map(id -> 8, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 9, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false), Map(id -> 10, hostname -> idc246-047, ip -> 192.168.110.47, toDecommission -> false))), masterPrincipal -> , zookeeper.server.memory -> 256, zookeeper.container.limits.memory -> -1, zookeeper.container.requests.cpu -> -1, zookeeper.memory.ratio -> -1, idc246-045 -> Map(zookeeper.client.port -> 2291, zookeeper.leader.elect.port -> 3988, zookeeper.peer.communicate.port -> 2988), realm -> TDH, initLimit -> 10, tickTime -> 9000, sid -> transwarp_license_cluster), GUARDIAN -> Map(keytab -> 
/etc/guardian/conf/guardian.keytab, guardian.ds.root.password -> admin, guardian.server.kerberos.password -> xmSJWu9V3RgpQyq2P0Bv, plugins -> List(), guardian.apacheds.ldap.port -> 10389, auth -> simple, domain -> dc=tdh, guardian.ds.realm -> TDH, guardian.server.port -> 8380, guardian.server.audit.level -> ADD,UPDATE,DELETE,LOGIN, guardian.apacheds.kdc.port -> 1088, kdc -> Map(hostname -> idc246-045, port -> 1088), guardian.apacheds.data.dir -> /guardian/data/, guardian.ds.ldap.tls.enabled -> false, id -> 16, guardian.client.cache.enabled -> true, roles -> Map(GUARDIAN_APACHEDS -> List(Map(id -> 87, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 88, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false)), GUARDIAN_SERVER -> List(Map(id -> 89, hostname -> idc246-045, ip -> 192.168.110.45, toDecommission -> false), Map(id -> 90, hostname -> idc246-046, ip -> 192.168.110.46, toDecommission -> false))), masterPrincipal -> guardian/guardian, guardian.admin.password -> admin, guardian.cache.repli.bind.port -> 7800, realm -> TDH, guardian.server.tls.enabled -> true, guardian.ds.domain -> dc=tdh, guardian.admin.username -> admin, guardian.server.audit.enabled -> true, sid -> guardian)))
tos registry hostname in Service(9,Some(1),SHIVA,None,transwarp-5.1.0-final,INSTALLED,shiva1,Shiva1,KUBERNETES,true,true,false) 's dependencies is idc246-045
generated shiva-tabletserver.yaml on [Manager]
start to create role(s) on [Manager] using kubectl --server=https://127.0.0.1:6443 --certificate-authority=/srv/kubernetes/ca.crt --client-certificate=/srv/kubernetes/kubecfg.crt --client-key=/srv/kubernetes/kubecfg.key scale --replicas=3 -f /var/lib/transwarp-manager/master/content/resources/services/shiva1/shiva-tabletserver.yaml…
role(s) successfully created on [Manager]
Mon Jan 15 10:12:19 CST 2018 [Manager] Task local part ended
Waiting SHIVA_TABLETSERVERs in Shiva1 to become Healthy within 600 seconds …
Latest health check result of roles:
DAEMON_CHECK DOWN at Mon Jan 15 10:14:52 CST 2018
SHIVA_TABLETSERVER on idc246-045 has Pod shiva-tabletserver-shiva1-3669319581-f7csl with status CrashLoopBackOff
DAEMON_CHECK DOWN at Mon Jan 15 10:14:52 CST 2018
SHIVA_TABLETSERVER on idc246-046 has Pod shiva-tabletserver-shiva1-3669319581-6zk70 with status CrashLoopBackOff
DAEMON_CHECK DOWN at Mon Jan 15 10:14:52 CST 2018
SHIVA_TABLETSERVER on idc246-047 has Pod shiva-tabletserver-shiva1-3669319581-gc508 with status CrashLoopBackOff
io.transwarp.manager.master.manager.operation.TaskLocalRunner$DownException: SHIVA_TABLETSERVERs in Shiva1 didn’t become Healthy within 600 seconds
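
For anyone hitting the same CrashLoopBackOff, a reasonable first diagnostic pass (a sketch, not a confirmed fix) is to pull the events and previous-container logs for one of the failing pods, then check the tabletserver's own log directory and keytab using the paths from the config dump above. The pod name and kubectl flags below are copied verbatim from the log; the keytab check is an assumption based on auth -> kerberos in the config, not something this thread confirms.

# Run on the Manager node; flags copied from the failing task's kubectl invocation.
KUBECTL="kubectl --server=https://127.0.0.1:6443 \
  --certificate-authority=/srv/kubernetes/ca.crt \
  --client-certificate=/srv/kubernetes/kubecfg.crt \
  --client-key=/srv/kubernetes/kubecfg.key"

# Events and restart reason for one of the crash-looping pods (name from the health check above).
$KUBECTL describe pod shiva-tabletserver-shiva1-3669319581-f7csl

# stdout/stderr of the last crashed container; this usually shows the startup error directly.
$KUBECTL logs --previous shiva-tabletserver-shiva1-3669319581-f7csl

# On each tabletserver host: the daemon's own logs, per tabletserver.log.log_dir in the config.
ls -lt /var/log/shiva1/

# auth is kerberos with keytab /etc/shiva1/conf/shiva.keytab; a missing or unreadable
# keytab is one plausible cause of a crash loop (assumption, not confirmed here).
klist -kt /etc/shiva1/conf/shiva.keytab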


#2

Hello, how did you solve this problem? I'm hitting the same error now.


#3

I'm running into the same problem.

