Notes on Tracking Down a Memory Leak

Background

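This problem was hit by a colleague on another project team at our company. At around 9 p.m., Grafana alerted that one of the services appeared to be hung. The developers logged on to the server right away and found the CPU fully loaded. They tried jstack to inspect the thread stacks of the service process, but it could no longer attach, so no stack information was available. With nothing else to go on, they started from the business logs and found a handful of scattered error entries, logged at around 9 a.m. The entries surrounding those errors were all database-write logs, and the write time was clearly getting longer, with occasional writes taking 20-odd or 30-odd seconds. The immediate suspicion was that a write had reached the 60-second limit and triggered an interrupt exception, which in turn caused a memory leak somewhere; that would explain both the saturated CPU and jstack being unable to attach. Following the stack traces in the error log led to code similar to the following: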

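import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
// RandomUtils is assumed to come from Apache Commons Lang 3.
import org.apache.commons.lang3.RandomUtils;

public class OutMemoryTest {

    // Unbounded queue: nothing limits its growth once the consumer is gone.
    private final Queue<Integer> queue = new LinkedBlockingQueue<>();

    public void consume() {
        // The try/catch wraps the whole loop, so the first exception ends consumption for good.
        try {
            while (true) {
                consumeSomething();
            }
        } catch (InterruptedException e) {
            System.out.println("Hit an exception: " + e.getMessage());
        }
    }

    private void consumeSomething() throws InterruptedException {
        // poll() removes the head of the queue and returns null when the queue is empty.
        Integer item = queue.poll();
        if (item == null) {
            return;
        }
        System.out.println("i have consume " + item);
        // Simulates a database timeout surfacing as an interrupt, roughly once in ten items.
        if (RandomUtils.nextInt(0, 10) == 5) {
            throw new InterruptedException("test");
        }
    }

    private void produce() {
        // Adds one item per second, whether or not anyone is still consuming.
        while (true) {
            queue.add(RandomUtils.nextInt());
            try {
                TimeUnit.SECONDS.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        OutMemoryTest outMemoryTest = new OutMemoryTest();
        new Thread(() -> outMemoryTest.produce()).start();
        new Thread(() -> outMemoryTest.consume()).start();
    }
}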

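In short, a producer keeps putting messages into a queue while a consumer keeps draining it. Under normal conditions this works fine. The trouble starts when the special case occurs, namely the `throw new InterruptedException` in `consumeSomething`, which simulates a database timeout raising an interrupt exception. Because the try...catch sits outside the `while` loop, the exception breaks out of the loop and the consumer stops consuming for good. The producer, however, keeps producing, so messages pile up in the unbounded queue and are never released. That backlog is the leak: the heap fills up, presumably leaving the JVM busy with garbage collection, which would account for the hung service and everything that followed.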
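One way to avoid this failure mode is sketched below. It is not the fix the team actually shipped (the post does not say what that was), only a minimal illustration under stated assumptions: the class name `ResilientConsumerSketch`, its methods, the capacity values, and the use of `TimeoutException` to stand in for a slow database write are all hypothetical. The sketch catches per-item failures inside the loop so that one bad write does not end consumption, treats a real thread interrupt as a request to stop, and bounds the queue so that a stalled consumer shows up as rejected items rather than as a growing heap.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ResilientConsumerSketch {

    // Bounded queue: the capacity is a hypothetical value supplied by the caller.
    private final BlockingQueue<Integer> queue;
    private int processed = 0;
    private int failed = 0;

    public ResilientConsumerSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // offer() returns false when the queue is full instead of growing without bound.
    public boolean produce(int item) {
        return queue.offer(item);
    }

    // Consumes until the queue stays empty for pollTimeoutMillis or the thread is interrupted.
    public void consumeUntilIdle(long pollTimeoutMillis) {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Integer item = queue.poll(pollTimeoutMillis, TimeUnit.MILLISECONDS);
                if (item == null) {
                    return; // nothing arrived within the timeout
                }
                handle(item);
                processed++;
            } catch (InterruptedException e) {
                // A real interrupt: restore the flag so the loop condition ends the loop.
                Thread.currentThread().interrupt();
            } catch (Exception e) {
                // A per-item failure: record it and move on to the next item.
                failed++;
                System.err.println("Failed to handle item: " + e.getMessage());
            }
        }
    }

    // Stand-in for the database write; items ending in 5 simulate a timeout.
    private void handle(int item) throws TimeoutException {
        if (item % 10 == 5) {
            throw new TimeoutException("simulated write timeout for " + item);
        }
    }

    public int processed() { return processed; }
    public int failed() { return failed; }
    public int pending() { return queue.size(); }

    public static void main(String[] args) {
        ResilientConsumerSketch sketch = new ResilientConsumerSketch(100);
        for (int i = 0; i < 20; i++) {
            sketch.produce(i);
        }
        sketch.consumeUntilIdle(100);
        System.out.println("processed=" + sketch.processed()
                + " failed=" + sketch.failed()
                + " pending=" + sketch.pending());
    }
}
```

Whether a full queue should reject, block, or drop items depends on the business requirements; the point of the bound is only that back-pressure becomes visible at the producer instead of silently consuming memory.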
